cartoframes package¶
Subpackages¶
Submodules¶
- cartoframes.analysis module
- cartoframes.columns module
- cartoframes.context module
- cartoframes.credentials module
- cartoframes.dataobs module
- cartoframes.dataset module
- cartoframes.examples module
- cartoframes.geojson module
- cartoframes.layer module
- cartoframes.maps module
- cartoframes.styling module
- cartoframes.utils module
Module contents¶
class cartoframes.CartoContext(base_url=None, api_key=None, creds=None, session=None, verbose=0)¶
Bases: object
CartoContext class for authentication with CARTO and high-level operations such as reading tables from CARTO into dataframes, writing dataframes to CARTO tables, creating custom maps from dataframes and CARTO tables, and augmenting data using CARTO’s Data Observatory. Future methods will interact with CARTO’s services like routing, geocoding, and isolines, PostGIS backend for spatial processing, and much more.
Manages connections with CARTO for data and map operations. Modeled after SparkContext.
There are two ways of authenticating against a CARTO account:

1. Setting the base_url and api_key directly in CartoContext. This method is easier:

       cc = CartoContext(
           base_url='https://eschbacher.carto.com',
           api_key='abcdefg')

2. By passing a Credentials instance in CartoContext's creds keyword argument. This method is more flexible:

       from cartoframes import Credentials
       creds = Credentials(username='eschbacher', key='abcdefg')
       cc = CartoContext(creds=creds)
creds¶
    Credentials instance

    Type: Credentials

Parameters:
- base_url (str) – Base URL of CARTO user account. Cloud-based accounts should use the form https://{username}.carto.com (e.g., https://eschbacher.carto.com for user eschbacher) whether on a personal or multi-user account. On-premises installation users should ask their admin.
- api_key (str) – CARTO API key.
- creds (Credentials) – A Credentials instance can be used in place of a base_url/api_key combination.
- session (requests.Session, optional) – requests session. See requests documentation for more information.
- verbose (bool, optional) – Output underlying process states (True), or suppress (False, default).
Returns: A CartoContext object that is authenticated against the user's CARTO account.
Return type: CartoContext
Example
Create a CartoContext object for a cloud-based CARTO account.

    import cartoframes
    # if on prem, format is '{host}/user/{username}'
    BASEURL = 'https://{}.carto.com/'.format('your carto username')
    APIKEY = 'your carto api key'
    cc = cartoframes.CartoContext(BASEURL, APIKEY)
Tip
If using cartoframes with an on-premises CARTO installation, it is sometimes necessary to disable SSL verification, depending on your system configuration. You can do this using a requests Session object as follows:

    import cartoframes
    from requests import Session

    session = Session()
    session.verify = False

    # on prem host (e.g., an IP address)
    onprem_host = 'your on prem carto host'
    cc = cartoframes.CartoContext(
        base_url='{host}/user/{user}'.format(
            host=onprem_host,
            user='your carto username'),
        api_key='your carto api key',
        session=session
    )
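As a hedged illustration of the two URL patterns above (this helper is not part of cartoframes; you pass the finished URL yourself), the cloud and on-premises base_url forms can be sketched as:

```python
def build_base_url(username, onprem_host=None):
    """Build a CARTO base_url for a cloud or on-premises account.

    Illustrative helper only, not part of the cartoframes API.
    """
    if onprem_host is not None:
        # on-prem pattern: '{host}/user/{username}'
        return '{host}/user/{user}'.format(host=onprem_host, user=username)
    # cloud pattern: 'https://{username}.carto.com/'
    return 'https://{}.carto.com/'.format(username)

assert build_base_url('eschbacher') == 'https://eschbacher.carto.com/'
assert build_base_url('eschbacher', 'https://10.0.0.1') == \
    'https://10.0.0.1/user/eschbacher'
```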
data(table_name, metadata, persist_as=None, how='the_geom')¶
Get an augmented CARTO dataset with Data Observatory measures. Use CartoContext.data_discovery to search for available measures, or see the full Data Observatory catalog. Optionally persist the data as a new table.
Example
Get a DataFrame with Data Observatory measures based on the geometries in a CARTO table.
    cc = cartoframes.CartoContext(BASEURL, APIKEY)
    median_income = cc.data_discovery('transaction_events',
                                      regex='.*median income.*',
                                      time='2011 - 2015')
    df = cc.data('transaction_events', median_income)
Pass in cherry-picked measures from the Data Observatory catalog. The rest of the metadata will be filled in, but it’s important to specify the geographic level as this will not show up in the column name.
    median_income = [{'numer_id': 'us.census.acs.B19013001',
                      'geom_id': 'us.census.tiger.block_group',
                      'numer_timespan': '2011 - 2015'}]
    df = cc.data('transaction_events', median_income)
Parameters:
- table_name (str) – Name of table on CARTO account that Data Observatory measures are to be added to.
- metadata (pandas.DataFrame) – List of all measures to add to table_name. See CartoContext.data_discovery outputs for a full list of metadata columns.
- persist_as (str, optional) – Output the results of augmenting table_name to persist_as as a persistent table on CARTO. Defaults to None, which will not create a table.
- how (str, optional) – Not fully implemented. Column name for identifying the geometry from which to fetch the data. Defaults to the_geom, which results in measures that are spatially interpolated (e.g., a neighborhood boundary's population will be calculated from underlying census tracts). Specifying a column that has the geometry identifier (for example, GEOID for US Census boundaries) results in measures taken directly from the Census for that GEOID, but normalized as specified in the metadata.
Returns: A DataFrame representation of table_name which has new columns for each measure in metadata.
Return type: pandas.DataFrame
Raises:
- NameError – If the columns in table_name are in the suggested_name column of metadata.
- ValueError – If the metadata object is invalid or empty, or if the number of requested measures exceeds 50.
- CartoException – If the user account consumes all of its Data Observatory quota.
data_boundaries(boundary=None, region=None, decode_geom=False, timespan=None, include_nonclipped=False)¶
Find all boundaries available for the world or a region. If boundary is specified, get all available boundary polygons for the region specified (if any). This method is especially useful for getting boundaries for a region and, combined with CartoContext.data and CartoContext.data_discovery, getting tables of geometries and the corresponding raw measures. For example, use it if you want to analyze how median income has changed in a region (see the examples section for more).

Examples
Find all boundaries available for Australia. The columns geom_name gives us the name of the boundary and geom_id is what we need for the boundary argument.
    import cartoframes
    cc = cartoframes.CartoContext('base url', 'api key')
    au_boundaries = cc.data_boundaries(region='Australia')
    au_boundaries[['geom_name', 'geom_id']]
Get the boundaries for Australian Postal Areas and map them.
    from cartoframes import Layer
    au_postal_areas = cc.data_boundaries(boundary='au.geo.POA')
    cc.write(au_postal_areas, 'au_postal_areas')
    cc.map(Layer('au_postal_areas'))
Get census tracts around Idaho Falls, Idaho, USA, and add median income from the US census. Without limiting the metadata, we get median income measures for each census in the Data Observatory.
    cc = cartoframes.CartoContext('base url', 'api key')
    # will return DataFrame with columns `the_geom` and `geom_ref`
    tracts = cc.data_boundaries(
        boundary='us.census.tiger.census_tract',
        region=[-112.096642, 43.429932, -111.974213, 43.553539])
    # write geometries to a CARTO table
    cc.write(tracts, 'idaho_falls_tracts')
    # gather metadata needed to look up median income
    median_income_meta = cc.data_discovery(
        'idaho_falls_tracts',
        keywords='median income',
        boundaries='us.census.tiger.census_tract')
    # get median income data and original table as new dataframe
    idaho_falls_income = cc.data(
        'idaho_falls_tracts',
        median_income_meta,
        how='geom_refs')
    # overwrite existing table with newly-enriched dataframe
    cc.write(idaho_falls_income, 'idaho_falls_tracts', overwrite=True)
Parameters:
- boundary (str, optional) – Boundary identifier for the boundaries that are of interest. For example, US census tracts have a boundary ID of us.census.tiger.census_tract, and Brazilian Municipios have an ID of br.geo.municipios. Find IDs by running CartoContext.data_boundaries without any arguments, or by looking in the Data Observatory catalog.
- region (str, optional) – Region where boundary information or, if boundary is specified, boundary polygons are of interest. region can be one of the following:
    - table name (str): Name of a table in user's CARTO account
    - bounding box (list of float): List of four values (two lng/lat pairs) in the following order: western longitude, southern latitude, eastern longitude, and northern latitude. For example, Switzerland fits in [5.9559111595,45.8179931641,10.4920501709,47.808380127]
- timespan (str, optional) – Specific timespan to get geometries from. Defaults to the most recent. See the Data Observatory catalog for more information.
- decode_geom (bool, optional) – Whether to return the geometries as Shapely objects or keep them encoded as EWKB strings. Defaults to False.
- include_nonclipped (bool, optional) – Optionally include non-shoreline-clipped boundaries. These boundaries are the raw boundaries provided by, for example, US Census TIGER.
Returns: If boundary is specified, then all available boundaries and accompanying geom_refs in region (or the world if region is None or not specified) are returned. If boundary is not specified, then a DataFrame of all available boundaries in region (or the world if region is None).
Return type: pandas.DataFrame
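The bounding-box ordering for the region argument is easy to get wrong. A minimal sketch (hypothetical helper, not part of cartoframes) that checks the expected [west, south, east, north] order:

```python
def check_bbox(bbox):
    """Validate a [west_lng, south_lat, east_lng, north_lat] list.

    Illustrative only; cartoframes does not expose this helper.
    """
    if len(bbox) != 4:
        raise ValueError('bounding box needs exactly four values')
    west, south, east, north = bbox
    if west >= east or south >= north:
        raise ValueError('order must be: western lng, southern lat, '
                         'eastern lng, northern lat')
    return bbox

# Switzerland, in the order data_boundaries expects
switzerland = [5.9559111595, 45.8179931641, 10.4920501709, 47.808380127]
assert check_bbox(switzerland) == switzerland
```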
data_discovery(region, keywords=None, regex=None, time=None, boundaries=None, include_quantiles=False)¶
Discover Data Observatory measures. This method returns the full Data Observatory metadata model for each measure or measures that match the conditions from the inputs. The full metadata in each row uniquely defines a measure based on the timespan, geographic resolution, and normalization (if any). Read more about the metadata response in Data Observatory documentation.
Internally, this method finds all measures in region that match the conditions set in keywords, regex, time, and boundaries (if any of them are specified). Then, if boundaries is not specified, a geographical resolution for that measure will be chosen subject to the type of region specified:
- If region is a table name, then a geographical resolution that is roughly equal to region size / number of subunits.
- If region is a country name or bounding box, then a geographical resolution will be chosen roughly equal to region size / 500.
Since different measures are available at some geographic resolutions and not others, different geographical resolutions are often returned for different measures.
Tip
To remove the guesswork in how geographical resolutions are selected, specify one or more boundaries in boundaries. See the boundaries section for each region in the Data Observatory catalog.
The metadata returned from this method can then be used to create raw tables or to augment an existing table with these measures using CartoContext.data. For the full Data Observatory catalog, visit https://cartodb.github.io/bigmetadata/. When working with the metadata DataFrame returned from this method, be careful to only remove rows, not columns, as CartoContext.data generally needs the full metadata.

Note

Narrowing down a discovery query using the keywords, regex, and time filters is important for getting a manageable metadata set. Besides there being a large number of measures in the Data Observatory, a metadata response includes acceptable combinations of measures with denominators (normalization and density), and the same measure from other years.
For example, setting the region to be United States counties with no filter values set will result in many thousands of measures.
Examples
Get all European Union measures that mention freight.

    meta = cc.data_discovery('European Union',
                             keywords='freight',
                             time='2010')
    print(meta['numer_name'].values)
Parameters:
- region (str or list of float) – Information about the region of interest. region can be one of three types:
    - region name (str): Name of region of interest. Acceptable values are limited to: 'Australia', 'Brazil', 'Canada', 'European Union', 'France', 'Mexico', 'Spain', 'United Kingdom', 'United States'.
    - table name (str): Name of a table in user's CARTO account with geometries. The region will be the bounding box of the table.
      Note
      If a table name is also a valid Data Observatory region name, the Data Observatory name will be chosen over the table.
    - bounding box (list of float): List of four values (two lng/lat pairs) in the following order: western longitude, southern latitude, eastern longitude, and northern latitude. For example, Switzerland fits in [5.9559111595,45.8179931641,10.4920501709,47.808380127]
  Note
  Geometry levels are generally chosen by subdividing the region into the next smallest administrative unit. To override this behavior, specify the boundaries flag. For example, set boundaries to 'us.census.tiger.census_tract' to choose US census tracts.
- keywords (str or list of str, optional) – Keyword or list of keywords in measure description or name. Response will be matched on all keywords listed (boolean or).
- regex (str, optional) – A regular expression to search the measure descriptions and names. Note that this relies on PostgreSQL's case-insensitive operator ~*. See PostgreSQL docs for more information.
- boundaries (str or list of str, optional) – Boundary or list of boundaries that specify the measure resolution. See the boundaries section for each region in the Data Observatory catalog.
- include_quantiles (bool, optional) – Include quantile calculations, which measure how a value compares to all values in the full dataset. Defaults to False. If True, quantile columns will be returned for each column that has them pre-calculated.
Returns: A DataFrame of the complete metadata model for specific measures based on the search parameters.
Return type: pandas.DataFrame
Raises:
- ValueError – If region is a list and does not consist of four elements, or if region is not an acceptable region.
- CartoException – If region is not a table in the user account.
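To see how the keywords and regex filters behave, here is a stdlib-only sketch of case-insensitive matching in the style of PostgreSQL's ~* operator. The measure names below are made up for illustration; the real search runs server-side against Data Observatory metadata.

```python
import re

# Hypothetical measure names standing in for Data Observatory metadata
measure_names = [
    'Median Household Income',
    'Total Population',
    'Median Income in the past 12 Months',
]

def filter_measures(names, regex):
    """Case-insensitive regex match, like PostgreSQL's ~* operator."""
    pattern = re.compile(regex, re.IGNORECASE)
    return [name for name in names if pattern.search(name)]

assert filter_measures(measure_names, 'median income') == [
    'Median Income in the past 12 Months']
assert filter_measures(measure_names, 'population') == ['Total Population']
```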
delete(table_name)¶
Delete a table in user's CARTO account.

Parameters: table_name (str) – Name of table to delete
Returns: True if table is removed
Return type: bool
execute(query)¶
Runs an arbitrary query against a CARTO account.

This method is especially useful for queries that do not return any data and just perform a database operation, like:

- INSERT, UPDATE, DROP, CREATE, ALTER, stored procedures, etc.

Queries are run using a Batch SQL API job in the user account. The execution of the queries is asynchronous, but this method automatically waits for its completion (or failure). The job_id of the Batch SQL API job will be printed. In case there is any issue, you can contact the CARTO support team specifying that job_id.
Parameters: query (str) – An SQL query to run against the CARTO user database.
Returns: None
Raises: CartoException – If the query fails to execute

Examples
Drops my_table
    cc.execute('''
        DROP TABLE my_table
    ''')
Updates the column my_column in the table my_table
    cc.execute('''
        UPDATE my_table
        SET my_column = 1
    ''')
fetch(query, decode_geom=False)¶
Pull the result of an arbitrary SELECT SQL query from a CARTO account into a pandas DataFrame.

Parameters:
- query (str) – SELECT query to run against the CARTO user database.
- decode_geom (bool, optional) – Decodes CARTO's geometries into a Shapely object that can be used, for example, in GeoPandas. Defaults to False.
Returns: DataFrame representation of the query supplied. Pandas data types are inferred from PostgreSQL data types. In the case of PostgreSQL date types, conversion is attempted, but on failure the data type 'object' is used.
Return type: pandas.DataFrame
Examples
This query gets the 10 highest values from a table and returns a dataframe.
    topten_df = cc.fetch('''
        SELECT * FROM my_table
        ORDER BY value_column DESC
        LIMIT 10
    ''')
This query joins points to polygons based on intersection, and aggregates by summing the values of the points in each polygon. The query returns a dataframe, with a geometry column that contains polygons.
    points_aggregated_to_polygons = cc.fetch(
        '''
        SELECT polygons.*, sum(points.values)
        FROM polygons
        JOIN points
          ON ST_Intersects(points.the_geom, polygons.the_geom)
        GROUP BY polygons.the_geom, polygons.cartodb_id
        ''',
        decode_geom=True)
get_default_schema()¶
map(**kwargs)¶
Produce a CARTO map visualizing data layers.
Examples
Create a map with two data Layers, and one BaseMap layer:

    import cartoframes
    from cartoframes import Layer, BaseMap, styling
    cc = cartoframes.CartoContext(BASEURL, APIKEY)
    cc.map(layers=[BaseMap(),
                   Layer('acadia_biodiversity',
                         color={'column': 'simpson_index',
                                'scheme': styling.tealRose(7)}),
                   Layer('peregrine_falcon_nest_sites',
                         size='num_eggs',
                         color={'column': 'bird_id',
                                'scheme': styling.vivid(10)})],
           interactive=True)
Create a snapshot of a map at a specific zoom and center:
    cc.map(layers=Layer('acadia_biodiversity',
                        color='simpson_index'),
           interactive=False,
           zoom=14,
           lng=-68.3823549,
           lat=44.3036906)
Parameters:
- layers (list, optional) – List of zero or more of the following:
    - Layer: cartoframes Layer object for visualizing data from a CARTO table. See Layer for all styling options.
    - BaseMap: Basemap for contextualizing data layers. See BaseMap for all styling options.
    - QueryLayer: Layer from an arbitrary query. See QueryLayer for all styling options.
- interactive (bool, optional) – Defaults to True to show an interactive slippy map. Setting to False creates a static map.
- zoom (int, optional) – Zoom level of map. Acceptable values are usually in the range 0 to 19. 0 has the entire earth on a single tile (256px square). Zoom 19 is the size of a city block. Must be used in conjunction with lng and lat. Defaults to a view that has all data layers in view.
- lat (float, optional) – Latitude value for the center of the map. Must be used in conjunction with zoom and lng. Defaults to a view that has all data layers in view.
- lng (float, optional) – Longitude value for the center of the map. Must be used in conjunction with zoom and lat. Defaults to a view that has all data layers in view.
- size (tuple, optional) – List of pixel dimensions for the map. Format is (width, height). Defaults to (800, 400).
- ax – matplotlib axis on which to draw the image. Only used when interactive is False.
Returns: Interactive maps are rendered as HTML in an iframe, while static maps are returned as matplotlib Axes objects or IPython Image.
Return type: IPython.display.HTML or matplotlib Axes
query(query, table_name=None, decode_geom=False, is_select=None)¶
Pull the result of an arbitrary SQL SELECT query from a CARTO account into a pandas DataFrame. This is the default behavior when is_select=True.

It can also be used to perform database operations (creating/dropping tables, adding columns, updates, etc.). In that case, you have to explicitly specify is_select=False.

This method is a helper for the CartoContext.fetch and CartoContext.execute methods. We strongly encourage you to use one of those methods depending on the type of query you want to run. If you want to get the results of a SELECT query into a pandas DataFrame, use CartoContext.fetch. For any other query that performs an operation on the CARTO database, use CartoContext.execute.
Parameters:
- query (str) – Query to run against the CARTO user database. This data will then be converted into a pandas DataFrame.
- table_name (str, optional) – If set (and is_select=True), this will create a new table in the user's CARTO account that is the result of the SELECT query provided. Defaults to None (no table created).
- decode_geom (bool, optional) – Decodes CARTO's geometries into a Shapely object that can be used, for example, in GeoPandas. It only works for SELECT queries when is_select=True.
- is_select (bool, optional) – This argument has to be set depending on the query performed: True for SELECT queries, False for any other query. For a SELECT SQL query (is_select=True) the result will be stored in a pandas DataFrame. For an arbitrary SQL query (is_select=False) it will perform a database operation (UPDATE, DROP, INSERT, etc.). The default is is_select=None, which means the method will return a DataFrame if the query starts with a SELECT clause, and otherwise will just execute the query and return None.
Returns: When is_select=True and the query is actually a SELECT query, this method returns a pandas DataFrame representation of the query supplied; otherwise it returns None. Pandas data types are inferred from PostgreSQL data types. In the case of PostgreSQL date types, conversion is attempted, but on failure the data type 'object' is used.
Return type: pandas.DataFrame
Raises: CartoException – If there is any error when executing the query
Examples
Query a table in CARTO and write a new table that is result of query. This query gets the 10 highest values from a table and returns a dataframe, as well as creating a new table called ‘top_ten’ in the CARTO account.
    topten_df = cc.query(
        '''
        SELECT * FROM my_table
        ORDER BY value_column DESC
        LIMIT 10
        ''',
        table_name='top_ten')
This query joins points to polygons based on intersection, and aggregates by summing the values of the points in each polygon. The query returns a dataframe, with a geometry column that contains polygons and also creates a new table called ‘points_aggregated_to_polygons’ in the CARTO account.
    points_aggregated_to_polygons = cc.query(
        '''
        SELECT polygons.*, sum(points.values)
        FROM polygons
        JOIN points
          ON ST_Intersects(points.the_geom, polygons.the_geom)
        GROUP BY polygons.the_geom, polygons.cartodb_id
        ''',
        table_name='points_aggregated_to_polygons',
        decode_geom=True)
Drops my_table
    cc.query('''
        DROP TABLE my_table
    ''')
Updates the column my_column in the table my_table
    cc.query('''
        UPDATE my_table
        SET my_column = 1
    ''')
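The default is_select=None dispatch described above can be sketched as follows. This is an assumed simplification for illustration, not the actual cartoframes implementation:

```python
def infer_is_select(sql):
    """Return True when the query's first keyword is SELECT,
    mimicking the documented is_select=None behavior (sketch only)."""
    return sql.lstrip().lower().startswith('select')

assert infer_is_select('SELECT * FROM my_table LIMIT 10') is True
assert infer_is_select('  select 1') is True
assert infer_is_select('DROP TABLE my_table') is False
```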
read(table_name, limit=None, decode_geom=False, shared_user=None, retry_times=3)¶
Read a table from CARTO into a pandas DataFrame. Column types are inferred from database types; to avoid problems with integer columns containing NA or null values, they are automatically retrieved as float64.
Parameters:
- table_name (str) – Name of table in user's CARTO account.
- limit (int, optional) – Read only limit lines from table_name. Defaults to None, which reads the full table.
- decode_geom (bool, optional) – Decodes CARTO's geometries into a Shapely object that can be used, for example, in GeoPandas.
- shared_user (str, optional) – If a table has been shared with you, specify the user name (schema) who shared it.
- retry_times (int, optional) – If the read call is rate limited, number of retries to be made.
Returns: DataFrame representation of table_name from CARTO.
Return type: pandas.DataFrame
Example
    import cartoframes
    cc = cartoframes.CartoContext(BASEURL, APIKEY)
    df = cc.read('acadia_biodiversity')
sync(dataframe, table_name)¶
Depending on the size of the DataFrame or CARTO table, perform granular operations on a DataFrame to only update the changed cells instead of a bulk upload. If on the large side, perform granular operations; if on the smaller side, use the Import API.
Note
Not yet implemented.
write(df, table_name, temp_dir='/home/docs/.cache/cartoframes', overwrite=False, lnglat=None, encode_geom=False, geom_col=None, **kwargs)¶
Write a DataFrame to a CARTO table.
Examples
Write a pandas DataFrame to CARTO.
cc.write(df, 'brooklyn_poverty', overwrite=True)
Scrape an HTML table from Wikipedia and send to CARTO with content guessing to create a geometry from the country column. This uses the CARTO Import API's content_guessing parameter.
    url = 'https://en.wikipedia.org/wiki/List_of_countries_by_life_expectancy'
    # retrieve first HTML table from that page
    df = pd.read_html(url, header=0)[0]
    # send to carto, let it guess polygons based on the 'country'
    # column. Also set privacy to 'public'
    cc.write(df, 'life_expectancy',
             content_guessing=True,
             privacy='public')
    cc.map(layers=Layer('life_expectancy',
                        color='both_sexes_life_expectancy'))
Warning
datetime64[ns] columns will lose precision when sending a DataFrame to CARTO, because PostgreSQL has millisecond resolution while pandas uses nanoseconds.
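A stdlib-only sketch of that precision loss: a nanosecond epoch value truncated to millisecond resolution cannot be recovered exactly (the constant and helper below are illustrative, not cartoframes code):

```python
NS_PER_MS = 1_000_000

def to_postgres_ms(epoch_ns):
    """Truncate a nanosecond epoch timestamp to millisecond resolution."""
    return epoch_ns // NS_PER_MS

epoch_ns = 1_500_000_000_123_456_789  # nanosecond precision
epoch_ms = to_postgres_ms(epoch_ns)
# the sub-millisecond digits (456789) are gone after the round trip
assert epoch_ms * NS_PER_MS == 1_500_000_000_123_000_000
assert epoch_ms * NS_PER_MS != epoch_ns
```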
Parameters:
- df (pandas.DataFrame) – DataFrame to write to table_name in user CARTO account.
- table_name (str) – Table to write df to in CARTO.
- temp_dir (str, optional) – Directory for temporary storage of data that is sent to CARTO. Defaults are defined by appdirs.
- overwrite (bool, optional) – Behavior for overwriting table_name if it exists on CARTO. Defaults to False.
- lnglat (tuple, optional) – lng/lat pair that can be used for creating a geometry on CARTO. Defaults to None. In some cases, geometry will be created without specifying this. See CARTO's Import API for more information.
- encode_geom (bool, optional) – Whether to write geom_col to CARTO as the_geom.
- geom_col (str, optional) – The name of the column where geometry information is stored. Used in conjunction with encode_geom.
- **kwargs – Keyword arguments to control write operations. Options are:
    - compression to set compression for files sent to CARTO. This will cause write speedups depending on the dataset. Options are None (no compression, default) or gzip.
    - Some arguments from CARTO's Import API. See the params listed in the documentation for more information. For example, when using content_guessing='true', a column named 'countries' with country names will be used to generate polygons for each country. Another use is setting the privacy of a dataset. To avoid unintended consequences, avoid file, url, and other similar arguments.
Returns: Dataset
Note
DataFrame indexes are changed to ordinary columns. CARTO creates an index called cartodb_id for every table that runs from 1 to the length of the DataFrame.
class cartoframes.Credentials(creds=None, key=None, username=None, base_url=None, cred_file=None)¶
Bases: object

Credentials class for managing and storing user CARTO credentials. The arguments are listed in order of precedence: Credentials instances are first, key and base_url/username are taken next, and config_file (if given) is taken last. If no arguments are passed, then there will be an attempt to retrieve credentials from a previously saved session. One of the above scenarios needs to be met to successfully instantiate a Credentials object.

Parameters:
- creds (cartoframes.Credentials, optional) – Credentials instance
- key (str, optional) – API key of user's CARTO account
- username (str, optional) – Username of CARTO account
- base_url (str, optional) – Base URL used for API calls. This is usually of the form https://eschbacher.carto.com/ for user eschbacher. On-premises installations (and others) have a different URL pattern.
- cred_file (str, optional) – Pull credentials from a stored file. If this and all other args are not entered, Credentials will attempt to load a user config credentials file that was previously set with Credentials(...).save().
Raises: RuntimeError – If not enough credential information is passed and no stored credentials file is found, this error will be raised.
Example
    from cartoframes import Credentials, CartoContext
    creds = Credentials(key='abcdefg', username='eschbacher')
    cc = CartoContext(creds=creds)
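The precedence order described above can be sketched with plain dictionaries. This is a hypothetical resolver for illustration; cartoframes performs this resolution internally:

```python
def resolve(creds=None, key=None, username=None, stored=None):
    """Pick credentials by precedence: an explicit Credentials value
    first, then key/username, then a previously stored session."""
    if creds is not None:
        return creds
    if key is not None and username is not None:
        return {'key': key, 'username': username}
    if stored is not None:
        return stored
    raise RuntimeError('no credentials found')

# an explicit creds argument wins over everything else
explicit = {'key': 'abcdefg', 'username': 'eschbacher'}
assert resolve(creds=explicit, key='ignored', username='ignored') == explicit
# with nothing else passed, the stored session is used
assert resolve(stored={'key': 'saved'}) == {'key': 'saved'}
```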
base_url(base_url=None)¶
Return or set base_url.

Parameters: base_url (str, optional) – If set, updates the base_url. Otherwise returns the current base_url.

Note

This does not update the username attribute. Separately update the username with Credentials.username, or update base_url and username at the same time with Credentials.set.

Example
    >>> from cartoframes import Credentials
    >>> # load credentials saved in previous session
    >>> creds = Credentials()
    >>> # returns current base_url
    >>> creds.base_url()
    'https://eschbacher.carto.com/'
    >>> # updates base_url with new value
    >>> creds.base_url('new_base_url')
delete(config_file=None)¶
Deletes the credentials file specified in config_file. If no file is specified, it deletes the default user credential file.

Parameters: config_file (str) – Path to configuration file. Defaults to delete the user default location if None.

Tip
To see if there is a default user credential file stored, do the following:
    >>> creds = Credentials()
    >>> print(creds)
    Credentials(username=eschbacher, key=abcdefg, base_url=https://eschbacher.carto.com/)
key(key=None)¶
Return or set API key.

Parameters: key (str, optional) – If set, updates the API key; otherwise returns the current API key.

Example
    >>> from cartoframes import Credentials
    >>> # load credentials saved in previous session
    >>> creds = Credentials()
    >>> # returns current API key
    >>> creds.key()
    'abcdefg'
    >>> # updates API key with new value
    >>> creds.key('new_api_key')
save(config_loc=None)¶
Saves current user credentials to the user directory.

Parameters: config_loc (str, optional) – Location where credentials are to be stored. If no argument is provided, credentials will be saved to the default location.

Example
    from cartoframes import Credentials
    creds = Credentials(username='eschbacher', key='abcdefg')
    creds.save()  # save to default location
set(key=None, username=None, base_url=None)¶
Update the credentials of a Credentials instance with new values.

Parameters:
- key (str) – API key of user account. Defaults to previous value if not specified.
- username (str) – User name of account. This parameter is optional if base_url is not specified, but defaults to the previous value if not set.
- base_url (str) – Base URL of user account. This parameter is optional if username is specified and the account is on CARTO's cloud. Generally of the form https://your_user_name.carto.com/ for cloud-based accounts. If on-prem or otherwise, contact your admin.
Example
    from cartoframes import Credentials
    # load credentials saved in previous session
    creds = Credentials()
    # set new API key
    creds.set(key='new_api_key')
    # save new creds to default user config directory
    creds.save()
Note

If the username is specified but the base_url is not, the base_url will be updated to https://<username>.carto.com/.
username(username=None)¶
Return or set username.

Parameters: username (str, optional) – If set, updates the username. Otherwise returns the current username.

Note

This does not update the base_url attribute. Use Credentials.set to have that updated with username.
Example
    >>> from cartoframes import Credentials
    >>> # load credentials saved in previous session
    >>> creds = Credentials()
    >>> # returns current username
    >>> creds.username()
    'eschbacher'
    >>> # updates username with new value
    >>> creds.username('new_username')
class cartoframes.BaseMap(source='voyager', labels='back', only_labels=False)¶
Bases: cartoframes.layer.AbstractLayer

Layer object for adding basemaps to a cartoframes map.
Example
Add a custom basemap to a cartoframes map.
    import cartoframes
    from cartoframes import BaseMap, Layer
    cc = cartoframes.CartoContext(BASEURL, APIKEY)
    cc.map(layers=[BaseMap(source='light', labels='front'),
                   Layer('acadia_biodiversity')])
Parameters:
- source (str, optional) – One of light or dark. Defaults to voyager. Basemaps come from https://carto.com/location-data-services/basemaps/
- labels (str, optional) – One of back, front, or None. Labels on the front will be above the data layers. Labels on back will be underneath the data layers but on top of the basemap. Setting labels to None will only show the basemap.
- only_labels (bool, optional) – Whether to show labels or not.
is_basemap = True¶

is_basic()¶
Does BaseMap pull from CARTO default basemaps?

Returns: True if using a CARTO basemap (Dark Matter, Positron or Voyager), False otherwise.
Return type: bool
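A sketch of the check is_basic performs, assuming the three default CARTO sources named above (dark for Dark Matter, light for Positron, voyager). The helper and constant are hypothetical stand-ins, not the cartoframes implementation:

```python
CARTO_DEFAULT_SOURCES = ('dark', 'light', 'voyager')

def is_basic(source):
    """True when the basemap source is a CARTO default
    (Dark Matter, Positron, or Voyager); sketch only."""
    return source in CARTO_DEFAULT_SOURCES

assert is_basic('voyager') is True
assert is_basic('https://example.com/custom/{z}/{x}/{y}.png') is False
```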
-
class
cartoframes.
QueryLayer
(query, time=None, color=None, size=None, opacity=None, tooltip=None, legend=None)¶ Bases:
cartoframes.layer.AbstractLayer
cartoframes data layer based on an arbitrary query to the user’s CARTO database. This layer class is useful for offloading processing to the cloud to do some of the following:
- Visualizing spatial operations using PostGIS and PostgreSQL, which is the database underlying CARTO
- Performing arbitrary relational database queries (e.g., complex JOINs in SQL instead of in pandas)
- Visualizing a subset of the data (e.g., SELECT * FROM table LIMIT 1000)
Used in the layers keyword in CartoContext.map.
Example
Underlay a QueryLayer with a complex query below a layer from a table. The QueryLayer is colored by the calculated column abs_diff, and points are sized by the column i_measure.

import cartoframes
from cartoframes import QueryLayer, styling
cc = cartoframes.CartoContext(BASEURL, APIKEY)
cc.map(layers=[QueryLayer('''
                   WITH i_cte As (
                       SELECT ST_Buffer(the_geom::geography, 500)::geometry As the_geom,
                              cartodb_id,
                              measure,
                              date
                         FROM interesting_data
                        WHERE date > '2017-04-19'
                   )
                   SELECT i.cartodb_id, i.the_geom,
                          ST_Transform(i.the_geom, 3857) AS the_geom_webmercator,
                          abs(i.measure - j.measure) AS abs_diff,
                          i.measure AS i_measure
                     FROM i_cte AS i
                     JOIN awesome_data AS j
                       ON i.event_id = j.event_id
                    WHERE j.measure IS NOT NULL
                      AND j.date < '2017-04-29'
                   ''',
                   color={'column': 'abs_diff',
                          'scheme': styling.sunsetDark(7)},
                   size='i_measure'),
               Layer('fantastic_sql_table')])
Parameters: - query (str) – Query to expose data on a map layer. At a minimum, a query needs to have the columns cartodb_id, the_geom, and the_geom_webmercator for the map to display. Read more about queries in CARTO’s docs.
- time (dict or str, optional) – Time-based style to apply to layer.

  If time is a str, it must be the name of a column which has a data type of datetime or float.

    from cartoframes import QueryLayer
    l = QueryLayer('SELECT * FROM acadia_biodiversity',
                   time='bird_sighting_time')

  If time is a dict, the following keys are options:
  - column (str, required): Column for animating map, which must be of type datetime or float.
  - method (str, optional): Type of aggregation method for operating on Torque TileCubes. Must be one of avg, sum, or another PostgreSQL aggregate function with a numeric output. Defaults to count.
  - cumulative (bool, optional): Whether to accumulate points over time (True) or not (False, default).
- duration (int, optional): Number of seconds in the animation. Defaults to 30.
- trails (int, optional): Number of trails after the incidence of a point. Defaults to 2.
    from cartoframes import Layer
    l = Layer('acadia_biodiversity',
              time={
                  'column': 'bird_sighting_time',
                  'cumulative': True,
                  'frames': 128,
                  'duration': 15
              })
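A minimal sketch (not the cartoframes implementation) of how a str/dict time argument could be normalized against the defaults documented above:

```python
# Illustrative helper: expand a `time` argument into a full style dict
# using the documented defaults. Not part of the cartoframes API.
TIME_DEFAULTS = {
    'method': 'count',    # aggregation method for Torque TileCubes
    'cumulative': False,  # do not accumulate points over time
    'frames': 256,        # number of frames in the animation
    'duration': 30,       # animation length in seconds
    'trails': 2,          # trail frames after a point's incidence
}

def normalize_time_style(time):
    """Turn a column name or partial dict into a complete time style."""
    if isinstance(time, str):
        time = {'column': time}
    if 'column' not in time:
        raise ValueError("'column' is required when `time` is a dict")
    style = dict(TIME_DEFAULTS)
    style.update(time)
    return style

style = normalize_time_style('bird_sighting_time')
print(style['frames'])  # 256 (default)
```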
- color (dict or str, optional) – Color style to apply to map. For example, this can be used to change the color of all geometries in this layer, or to create a graduated color or choropleth map.

  If color is a str, there are two options:
  - A column name to style by to create, for example, a choropleth map if working with polygons. The default classification is quantiles for quantitative data and category for qualitative data.
  - A hex value or web color name.

    # color all geometries red (#F00)
    from cartoframes import Layer
    l = Layer('acadia_biodiversity', color='red')

    # color on 'num_eggs' (using default color scheme and quantification)
    l = Layer('acadia_biodiversity', color='num_eggs')
  If color is a dict, the following keys are options, with values described:
  - column (str): Column used for the basis of styling
  - scheme (dict, optional): Scheme such as styling.sunset(7) from the styling module of cartoframes that exposes CARTOColors. Defaults to mint scheme for quantitative data and bold for qualitative data. More control is given by using styling.scheme. If you wish to define a custom scheme outside of CARTOColors, it is recommended to use the styling.custom utility function.

    from cartoframes import QueryLayer, styling
    l = QueryLayer('SELECT * FROM acadia_biodiversity',
                   color={
                       'column': 'simpson_index',
                       'scheme': styling.mint(7, bin_method='equal')
                   })
- size (dict or int, optional) – Size style to apply to point data.

  If size is an int, all points are sized by this value.

    from cartoframes import QueryLayer
    l = QueryLayer('SELECT * FROM acadia_biodiversity', size=7)

  If size is a str, this value is interpreted as a column, and the points are sized by the value in this column. The classification method defaults to quantiles, with a min size of 5 and a max size of 25. Use the dict input to override these values.

    from cartoframes import Layer
    l = Layer('acadia_biodiversity', size='num_eggs')
  If size is a dict, the following keys are options, with values described as:
  - column (str): Column to base sizing of points on
  - bin_method (str, optional): Quantification method for dividing data range into bins. Must be one of the methods in BinMethod (excluding category).
  - bins (int, optional): Number of bins to break data into. Defaults to 5.
  - max (int, optional): Maximum point width (in pixels). Setting this overrides range. Defaults to 25.
  - min (int, optional): Minimum point width (in pixels). Setting this overrides range. Defaults to 5.
  - range (tuple or list, optional): A min/max pair. Defaults to [1, 5] for lines and [5, 25] for points.

    from cartoframes import Layer
    l = Layer('acadia_biodiversity',
              size={
                  'column': 'num_eggs',
                  'max': 10,
                  'min': 2
              })
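The quantile sizing described above can be sketched in plain Python. This is an illustrative approximation, not the cartoframes implementation; the helper names quantile_breaks and size_for are hypothetical:

```python
# Illustrative sketch: quantile bin edges and point-width interpolation.
# Not the cartoframes implementation.
def quantile_breaks(values, bins=5):
    """Return `bins - 1` interior quantile break points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * k) // bins] for k in range(1, bins)]

def size_for(value, breaks, min_size=5, max_size=25):
    """Interpolate a point width from the bin a value falls into."""
    bin_idx = sum(value > b for b in breaks)       # 0 .. len(breaks)
    step = (max_size - min_size) / len(breaks)
    return min_size + bin_idx * step

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
breaks = quantile_breaks(values, bins=5)
print(size_for(1, breaks), size_for(10, breaks))  # smallest / largest widths
```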
- opacity (float, optional) – Opacity of layer from 0 to 1. Defaults to 0.9.
- tooltip (tuple, optional) – Not yet implemented.
- legend – Not yet implemented.
Raises:
- CartoException – If a column name used in any of the styling options is not in the data source in query (or table if using Layer).
- ValueError – If styling using a dict and a column key is not present, or if the data type for a styling option is not supported. This is also raised if styling by a geometry column (i.e., the_geom or the_geom_webmercator). Further, this is raised if requesting a time-based map with a data source that has geometries other than points.
-
class cartoframes.Layer(table_name, source=None, overwrite=False, time=None, color=None, size=None, opacity=None, tooltip=None, legend=None)¶ Bases: cartoframes.layer.QueryLayer
A cartoframes data layer based on a specific table in the user's CARTO database. This layer class is used for visualizing individual datasets with CartoContext.map's layers keyword argument.
Example

import cartoframes
from cartoframes import Layer, styling
cc = cartoframes.CartoContext(BASEURL, APIKEY)
cc.map(layers=[Layer('fantastic_sql_table',
                     size=7,
                     color={'column': 'mr_fox_sightings',
                            'scheme': styling.prism(10)})])
Parameters:
- table_name (str) – Name of table in CARTO account
- Styling – See QueryLayer for a full list of all arguments for styling this map data layer.
- source (pandas.DataFrame, optional) – Not currently implemented
- overwrite (bool, optional) – Not currently implemented
-
class cartoframes.BinMethod¶
Data classification methods used for the styling of data on maps.
-
quantiles¶ Quantiles classification for quantitative data
Type: str
-
jenks¶ Jenks classification for quantitative data
Type: str
-
headtails¶ Head/Tails classification for quantitative data
Type: str
-
equal¶ Equal Interval classification for quantitative data
Type: str
-
category¶ Category classification for qualitative data
Type: str
-
mapping¶ The TurboCarto mappings
Type: dict
-
category = 'category'¶
-
equal = 'equal'¶
-
headtails = 'headtails'¶
-
jenks = 'jenks'¶
-
mapping = {'category': '=', 'equal': '>', 'headtails': '<', 'jenks': '>', 'quantiles': '>'}¶
-
quantiles = 'quantiles'¶
-
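As an illustrative sketch (not the cartoframes source), the mapping attribute pairs each bin method with the TurboCarto comparison operator used when building ramp expressions. A minimal standalone version:

```python
# Sketch of BinMethod: constants plus the documented TurboCarto mapping.
# Mirrors the values shown above; not the cartoframes source.
class BinMethod:
    quantiles = 'quantiles'
    jenks = 'jenks'
    headtails = 'headtails'
    equal = 'equal'
    category = 'category'
    mapping = {'category': '=', 'equal': '>', 'headtails': '<',
               'jenks': '>', 'quantiles': '>'}

def turbo_operator(bin_method):
    """Look up the TurboCarto comparison operator for a bin method."""
    try:
        return BinMethod.mapping[bin_method]
    except KeyError:
        raise ValueError('Unknown bin method: {}'.format(bin_method))

print(turbo_operator(BinMethod.quantiles))  # >
print(turbo_operator(BinMethod.category))   # =
```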
-
class cartoframes.Dataset(table_name=None, schema=None, query=None, df=None, gdf=None, state=None, context=None)¶ Bases: object
-
APPEND = 'append'¶
-
DEFAULT_RETRY_TIMES = 3¶
-
FAIL = 'fail'¶
-
GEOM_TYPE_LINE = 'line'¶
-
GEOM_TYPE_POINT = 'point'¶
-
GEOM_TYPE_POLYGON = 'polygon'¶
-
LINK = 'link'¶
-
PRIVATE = 'private'¶
-
PUBLIC = 'public'¶
-
REPLACE = 'replace'¶
-
STATE_LOCAL = 'local'¶
-
STATE_REMOTE = 'remote'¶
-
compute_geom_type()¶ Compute the geometry type from the data
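A minimal sketch of what geometry-type computation could look like, reducing GeoJSON geometry type strings to the Dataset constants above. This is illustrative only, not the cartoframes implementation:

```python
# Illustrative sketch: map GeoJSON geometry types onto the Dataset
# geometry-type constants. Not the cartoframes implementation.
GEOM_TYPE_POINT = 'point'
GEOM_TYPE_LINE = 'line'
GEOM_TYPE_POLYGON = 'polygon'

_GEOJSON_TO_GEOM = {
    'Point': GEOM_TYPE_POINT,
    'MultiPoint': GEOM_TYPE_POINT,
    'LineString': GEOM_TYPE_LINE,
    'MultiLineString': GEOM_TYPE_LINE,
    'Polygon': GEOM_TYPE_POLYGON,
    'MultiPolygon': GEOM_TYPE_POLYGON,
}

def compute_geom_type(geojson_type):
    """Reduce a GeoJSON geometry type string to point/line/polygon."""
    try:
        return _GEOJSON_TO_GEOM[geojson_type]
    except KeyError:
        raise ValueError('Unsupported geometry type: {}'.format(geojson_type))

print(compute_geom_type('MultiPolygon'))  # polygon
```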
-
delete()¶
-
download(limit=None, decode_geom=False, retry_times=3)¶
-
exists()¶ Checks to see if the table exists
-
classmethod from_dataframe(df)¶
-
classmethod from_geodataframe(gdf)¶
-
classmethod from_geojson(geojson)¶
-
classmethod from_query(query, context=None)¶
-
classmethod from_table(table_name, context=None, schema=None)¶
-
get_table_column_names(exclude=None)¶ Get column names and types from a table
-
get_table_columns()¶ Get column names and types from a table or query result
-
upload(with_lnglat=None, if_exists='fail', table_name=None, schema=None, context=None)¶
-