Load data

Base functions to load data from various sources.

`load_spec`(scan_id, spec_file[, folder])	Load data from spec file.
`load_csv`(scan_id[, folder, name_format])	Load data from the 'primary' stream from exported csv files.
`load_hdf5_data`(scan, folder[, fname_format, ...])	Wrapper that loads HDF files using h5py.
`hdf5_to_dataframe`(data)	Converts h5py object into dataframe
`load_hdf5_master`(scan, folder[, fname_format])	Wrapper that loads HDF files using h5py.
`load_databroker`(scan_id[, db, stream, ...])	Load data of the first scan with the provided scan_id.
`load_table`(scan[, source])	Automated scan table loader.
`is_Bluesky_specfile`(source[, folder])	Check if the specfile was created by Bluesky.
`db_query`(db, query)	Searches the databroker v2 database.
`show_meta`([last, scans, scans_to, db, ...])	Print metadata of scans.
`collect_meta`(scan_numbers, meta_keys[, db, ...])	Extracts metadata of a list of scans.
`lookup_position`(db, scan[, search_string, query])	Lookup positioner values in past scans.

polartools.load_data.collect_meta(scan_numbers, meta_keys, db=None, query=None)

Extracts metadata of a list of scans.

Parameters:

scan_numbers (list): List of scan numbers, or scan number range, to be processed.
db (databroker database): Searchable database.
meta_keys (iterable): List with metadata keys to read.
query (dictionary, optional): Search parameters.

Returns:

meta (dictionary): Metadata organized by scan number or uid (whichever is given in scan_numbers).

polartools.load_data.db_query(db, query)

Searches the databroker v2 database.

Parameters:

db: databroker database.
query (dict): Search parameters.

Returns:

_db: Subset of db that satisfies the search parameters. Note that it has the same format as db.

See also

databroker.catalog.search()

polartools.load_data.hdf5_to_dataframe(data)

Converts an h5py object into a dataframe.

WARNING: it assumes a very specific format. For each key in data, it reads the values stored in `data["key/value"]`.

Parameters:

data (h5py object): Object with the data. Each key needs to have a "value" subkey.

Returns:

data (pandas.DataFrame): Table with the data.

See also

polartools.load_data.load_hdf5_master()
h5py.File()
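
The layout requirement above (every key must carry a "value" subkey) can be pictured with a small sketch. This is not the polartools implementation: a plain nested dict stands in for the h5py object, and the helper name `collect_columns` and the key names are hypothetical.

```python
# Plain-dict stand-in for an h5py group (assumption: illustration only).
# Each top-level key holds a "value" entry, mirroring data["key/value"].
stream = {
    "th": {"value": [10.0, 10.5, 11.0]},
    "ion_chamber": {"value": [1021, 1034, 1050]},
}


def collect_columns(data):
    """Return {key: data[key]["value"]}, failing early if "value" is missing."""
    columns = {}
    for key in data:
        if "value" not in data[key]:
            raise KeyError(f'"{key}" has no "value" subkey')
        columns[key] = list(data[key]["value"])
    return columns


columns = collect_columns(stream)
```

A dict of equal-length lists like `columns` is the kind of input that `pandas.DataFrame(columns)` turns into a table, one column per key.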

polartools.load_data.is_Bluesky_specfile(source, folder='')

Check if the specfile was created by Bluesky.

It looks for a “Bluesky” comment in the file header.

Parameters:

source (string or spec2nexus.spec.SpecDataFile): Either the spec file name or a SpecDataFile instance.
folder (string, optional): Folder where spec file is located.

Returns:

y (bool): True if the spec file was generated by Bluesky, False otherwise.

See also

spec2nexus.spec.SpecDataFile()
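
The header check described for is_Bluesky_specfile can be sketched on raw text lines. This is a minimal illustration, not the library's code: it assumes the usual SPEC conventions that comment lines start with "#C" and that the file header ends at the first "#S" scan line, and it assumes a case-sensitive match. The helper name and the sample headers are hypothetical.

```python
def header_mentions_bluesky(lines):
    """Look for "Bluesky" in the comment lines that precede the first scan."""
    for line in lines:
        if line.startswith("#S "):
            break  # first scan block reached: the file header is over
        if line.startswith("#C") and "Bluesky" in line:
            return True
    return False


# Made-up headers for illustration only.
bluesky_header = [
    "#F example.dat",
    "#E 1600000000",
    "#C Bluesky  user = someone  host = somewhere",
    "",
    "#S 1  scan",
]
other_header = [
    "#F example.dat",
    "#E 1600000000",
    "#C fourc  User = someone",
    "",
    "#S 1  ascan th 0 1 10 1",
    "#C Bluesky mentioned inside a scan, not in the header",
]
```

Stopping at the first "#S" line keeps comments inside individual scans from being mistaken for the file header.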

polartools.load_data.load_catalog(name=None, query=None, handlers=None, tiled_path='/raw')

Loads a databroker catalog and registers data handlers.

Parameters:

name (str, optional): Name of the database. Defaults to 4-ID-D name.
query (dict, optional): Dictionary with search parameters for the database.
handlers (dict, optional): Dictionary organized as {handler_name: handler_class}. If None, defaults to handlers used at 4-ID-D.

Returns:

cat (databroker catalog): Catalog after running the query and registering the handlers.

polartools.load_data.load_csv(scan_id, folder='', name_format='scan_{}_primary.csv')

Load data from the ‘primary’ stream from exported csv files.

Parameters:

scan_id (int): Scan_id of the scan to be retrieved.
folder (string, optional): Folder where csv files are located.
name_format (string, optional): General format of file name. The correct name must be retrievable through: name_format.format(scan_id)

Returns:

data (pandas.DataFrame): Table with the data from the primary stream.

See also

pandas.read_csv()
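
How name_format and scan_id combine into a file name can be shown with a short sketch. The `.format(scan_id)` step comes from the parameter description above; joining the result with folder through `os.path.join` is an assumption, and the helper name `csv_path` is hypothetical.

```python
import os


def csv_path(scan_id, folder="", name_format="scan_{}_primary.csv"):
    """Build the expected csv path: the format call fills in the scan_id."""
    file_name = name_format.format(scan_id)
    # Assumption: the folder is simply prepended to the file name.
    return os.path.join(folder, file_name)


default_path = csv_path(42)
custom_path = csv_path(7, name_format="run{:04d}.csv")
```

Any name_format with a single replacement field works here, including ones with a format spec such as `{:04d}` for zero padding.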

polartools.load_data.load_databroker(scan_id, db=None, stream='primary', query=None, use_db_v1=True)

Load data of the first scan with the provided scan_id.

Currently defaults to databroker.v1 because it is faster. See issue #28.

For further details, refer to the databroker documentation.

Parameters:

scan_id (int): Scan_id of the scan to be retrieved.
db: databroker database.
stream (string, optional): Selects the stream from which data will be loaded.
query (dict, optional): Dictionary with search parameters for the database.
use_db_v1 (bool, optional): Selects the databroker API version: True uses ‘v1’, False uses ‘v2’. Defaults to True.

Returns:

data (pandas.DataFrame): Table with the data from the selected stream (‘primary’ by default).

polartools.load_data.load_hdf5_data(scan, folder, fname_format='scan_{:06d}_master.hdf', h5_location='entry/instrument/bluesky/streams/primary')

Wrapper that loads HDF files using h5py.

Parameters:

scan (int): Scan_id of the scan to be retrieved.
folder (string): Folder where the master files are located.
fname_format (string, optional): General format of file name. The correct name must be retrievable through: fname_format.format(scan)
h5_location (string, optional): Location of the Bluesky data stream.

Returns:

data (pandas.DataFrame): Table with the data from the primary stream.

See also

polartools.load_data.load_hdf5_master()
polartools.load_data.hdf5_to_dataframe()
h5py.File()

polartools.load_data.load_hdf5_master(scan, folder, fname_format='scan_{:06d}_master.hdf')

Wrapper that loads HDF files using h5py.

Parameters:

scan (int): Scan_id of the scan to be retrieved.
folder (string): Folder where the master files are located.
fname_format (string, optional): General format of file name. The correct name must be retrievable through: fname_format.format(scan)

Returns:

data (h5py.File): Loaded HDF file.

See also

h5py.File()

polartools.load_data.load_spec(scan_id, spec_file, folder='')

Load data from spec file.

If spec_file is a file name, the spec file is loaded internally, which is time consuming. Passing an already loaded SpecDataFile instance avoids reloading it.

Parameters:

scan_id (int): Scan_id of the scan to be retrieved.
spec_file (string or spec2nexus.spec.SpecDataFile): Either the spec file name or a SpecDataFile instance.
folder (string, optional): Folder where spec file is located.

Returns:

data (pandas.DataFrame): Table with the data from scan.

See also

spec2nexus.spec.SpecDataFile()

polartools.load_data.load_table(scan, source=None, **kwargs)

Automated scan table loader.

The automation is based on the source argument.

if source == ‘csv’ -> uses load_csv.
else if source is a string or spec2nexus.spec.SpecDataFile -> uses load_spec.
else -> uses load_databroker.
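
The three rules above can be sketched as a small dispatcher. This is an illustration of the stated rules, not the polartools source: the function returns the loader name rather than calling it, and `SpecDataFileStandIn` is a hypothetical placeholder for spec2nexus.spec.SpecDataFile so the sketch has no dependencies.

```python
class SpecDataFileStandIn:
    """Hypothetical placeholder for spec2nexus.spec.SpecDataFile."""


def pick_loader(source=None):
    """Return the name of the loader that the rules above select."""
    # 'csv' is itself a string, so it must be tested before the generic
    # string check, otherwise it would be treated as a spec file name.
    if source == "csv":
        return "load_csv"
    if isinstance(source, (str, SpecDataFileStandIn)):
        return "load_spec"
    # Anything else (including None) falls through to databroker.
    return "load_databroker"


choices = [
    pick_loader("csv"),
    pick_loader("experiment.dat"),
    pick_loader(SpecDataFileStandIn()),
    pick_loader(None),
]
```

The order of the checks matters: swapping the first two would send source='csv' to load_spec.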

Parameters:

scan (int): Scan_id or uid. If scan_id is passed, it will load the last scan with that scan_id. See kwargs for search options.

source (databroker database, name of the spec file, or ‘csv’): Note that applicable kwargs depend on this selection.

kwargs

The necessary kwargs are passed to the loading functions defined by the source argument:

csv -> possible kwargs: folder, name_format.
spec -> possible kwargs: folder.
databroker -> possible kwargs: stream, query, use_db_v1.

Note that a warning will be printed if an unnecessary kwarg is passed.
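
The kwarg handling described above can be sketched as a filter over the names listed for each source. The allowed names come from the list above; everything else is an assumption made for illustration, in particular that unused kwargs are dropped and that the warning is issued with `warnings.warn`. The helper name `split_kwargs` is hypothetical.

```python
import warnings

# Allowed kwargs per source kind, copied from the list above.
ALLOWED_KWARGS = {
    "csv": {"folder", "name_format"},
    "spec": {"folder"},
    "databroker": {"stream", "query", "use_db_v1"},
}


def split_kwargs(kind, kwargs):
    """Keep only the kwargs the chosen loader accepts; warn about the rest."""
    allowed = ALLOWED_KWARGS[kind]
    kept = {name: value for name, value in kwargs.items() if name in allowed}
    for name in sorted(set(kwargs) - allowed):
        # Assumption: the unused kwarg is ignored rather than raising an error.
        warnings.warn(f"'{name}' is not used by the {kind} loader and was ignored.")
    return kept


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = split_kwargs("spec", {"folder": "data", "stream": "primary"})
```

Here `kept` holds only the folder entry, and `caught` records one warning naming the unused stream kwarg.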

Returns:

table (pandas.DataFrame): Table with the scan data.

See also

polartools.load_data.load_databroker()
polartools.load_data.load_csv()
polartools.load_data.load_spec()

polartools.load_data.lookup_position(db, scan, search_string='', query=None)

Lookup positioner values in past scans.

Parameters:

db (databroker database): Searchable database.
scan (integer): Scan numbers or uids.
search_string (string): Full or part of positioner name.
query (dict): Search parameters.

Returns:

output: list

polartools.load_data.show_meta(last=None, scans=None, scans_to=None, db=None, query=None, meta_keys='short', table_fmt='plain')

Print metadata of scans.

Parameters:

last (int): Number of most recent scans to be displayed.
scans (int, list): List of scan numbers to process.
scans_to (int, list): Final scan number to process. Note that this is only meaningful if an integer is passed to scans.
db (databroker database, optional): Searchable database.
query (dictionary, optional): Search parameters.
meta_keys (string or iterable, optional): List with metadata keys to read. There are two preset metadata lists that can be used with meta_keys="short" or meta_keys="long".