Datasets API#
Handle#
get_dataset_api#
Project.get_dataset_api()
Get the Datasets API handle for the project.
Returns
DatasetApi: the Datasets API handle
Methods#
copy#
DatasetApi.copy(source_path, destination_path, overwrite=False)
Copy a file or directory in the Hopsworks Filesystem.
import hopsworks
project = hopsworks.login()
dataset_api = project.get_dataset_api()
directory_path = dataset_api.copy("Resources/myfile.txt", "Logs/myfile.txt")
Arguments
- source_path str: the source path to copy
- destination_path str: the destination path
- overwrite bool: overwrite the destination if it exists
Raises
RestAPIError: If unable to perform the copy
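As an illustration of how `copy` and its `overwrite` flag might be used, here is a minimal sketch of a backup helper. The function name `backup_file` and the `backup_dir` parameter are hypothetical and not part of the Datasets API; the sketch only assumes that `dataset_api` exposes `copy` with the signature documented above.

```python
import posixpath


def backup_file(dataset_api, source_path, backup_dir, overwrite=False):
    """Copy source_path into backup_dir, keeping the original file name.

    backup_file is a hypothetical helper, not part of the Datasets API.
    """
    # Paths in the Hopsworks Filesystem use forward slashes, so posixpath
    # is used instead of os.path to stay platform independent.
    file_name = posixpath.basename(source_path)
    destination_path = posixpath.join(backup_dir, file_name)
    dataset_api.copy(source_path, destination_path, overwrite=overwrite)
    return destination_path
```

With a handle obtained from `project.get_dataset_api()`, calling `backup_file(dataset_api, "Resources/myfile.txt", "Logs")` would request a copy to `Logs/myfile.txt`.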
download#
DatasetApi.download(path, local_path=None, overwrite=False)
Download a file from the Hopsworks Filesystem to the local filesystem. If local_path is not given, the file is downloaded to the current working directory.
import hopsworks
project = hopsworks.login()
dataset_api = project.get_dataset_api()
downloaded_file_path = dataset_api.download("Resources/my_local_file.txt")
Arguments
- path str: path in the Hopsworks Filesystem to the file
- local_path Optional[str]: path where to download the file in the local filesystem
- overwrite bool: overwrite the local file if it exists
Returns
str: Path to downloaded file
Raises
RestAPIError: If unable to download the file
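The `local_path` argument can be sketched as follows. The helper `download_to_directory` and its `local_dir` parameter are hypothetical. The sketch builds a full local file path rather than passing a directory, since the documentation above does not state whether `local_path` may be a directory.

```python
import os
import posixpath


def download_to_directory(dataset_api, path, local_dir, overwrite=False):
    """Download a Hopsworks file into local_dir, creating local_dir if needed.

    download_to_directory is a hypothetical helper, not part of the Datasets API.
    """
    # Make sure the local target directory exists before downloading.
    os.makedirs(local_dir, exist_ok=True)
    # Remote paths use forward slashes; the local path uses the OS separator.
    local_file = os.path.join(local_dir, posixpath.basename(path))
    return dataset_api.download(path, local_path=local_file, overwrite=overwrite)
```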
exists#
DatasetApi.exists(path)
Check if a file exists in the Hopsworks Filesystem.
Arguments
- path str: path to check
Returns
bool: True if exists, otherwise False
Raises
RestAPIError: If unable to check existence for the path
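One possible use of `exists` is as a guard before another call. The sketch below, with the hypothetical helper name `download_if_present`, returns `None` instead of attempting a download when the path is missing. It assumes only the `exists` and `download` signatures documented on this page.

```python
def download_if_present(dataset_api, path, overwrite=False):
    """Download path only if it exists; return None otherwise.

    download_if_present is a hypothetical helper, not part of the Datasets API.
    """
    if not dataset_api.exists(path):
        # Nothing to download, so skip the call that would otherwise fail.
        return None
    return dataset_api.download(path, overwrite=overwrite)
```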
mkdir#
DatasetApi.mkdir(path)
Create a directory in the Hopsworks Filesystem.
import hopsworks
project = hopsworks.login()
dataset_api = project.get_dataset_api()
directory_path = dataset_api.mkdir("Resources/my_dir")
Arguments
- path str: path to directory
Returns
str: Path to created directory
Raises
RestAPIError: If unable to create the directory
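A common pattern is to create a directory only when it is missing. The following is a minimal sketch with a hypothetical helper name, `ensure_directory`. It assumes that `exists` also reports directories, which the documentation on this page does not explicitly confirm.

```python
def ensure_directory(dataset_api, path):
    """Return path, creating the directory first if it is not there yet.

    ensure_directory is a hypothetical helper, not part of the Datasets API.
    Assumption: exists() returns True for existing directories as well as files.
    """
    if dataset_api.exists(path):
        return path
    return dataset_api.mkdir(path)
```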
move#
DatasetApi.move(source_path, destination_path, overwrite=False)
Move a file or directory in the Hopsworks Filesystem.
import hopsworks
project = hopsworks.login()
dataset_api = project.get_dataset_api()
directory_path = dataset_api.move("Resources/myfile.txt", "Logs/myfile.txt")
Arguments
- source_path str: the source path to move
- destination_path str: the destination path
- overwrite bool: overwrite the destination if it exists
Raises
RestAPIError: If unable to perform the move
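Because `move` takes a full destination path, renaming a file within its directory amounts to a move with the same parent. A minimal sketch, using the hypothetical helper name `rename_in_place`:

```python
import posixpath


def rename_in_place(dataset_api, path, new_name, overwrite=False):
    """Rename a file or directory without changing its parent directory.

    rename_in_place is a hypothetical helper, not part of the Datasets API.
    """
    # Keep the parent directory and swap only the last path component.
    parent = posixpath.dirname(path)
    destination_path = posixpath.join(parent, new_name)
    dataset_api.move(path, destination_path, overwrite=overwrite)
    return destination_path
```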
remove#
DatasetApi.remove(path)
Remove a path in the Hopsworks Filesystem.
Arguments
- path str: path to remove
Raises
RestAPIError: If unable to remove the path
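Cleanup code often needs to tolerate a path that is already gone. The sketch below combines `exists` and `remove` in a hypothetical helper, `remove_if_exists`, which reports whether anything was removed. Whether `remove` itself raises for a missing path is not stated above, so the sketch checks first.

```python
def remove_if_exists(dataset_api, path):
    """Remove path if present; return True if a removal was requested.

    remove_if_exists is a hypothetical helper, not part of the Datasets API.
    """
    if not dataset_api.exists(path):
        return False
    dataset_api.remove(path)
    return True
```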
upload#
DatasetApi.upload(
    local_path,
    upload_path,
    overwrite=False,
    chunk_size=1048576,
    simultaneous_uploads=3,
    max_chunk_retries=1,
    chunk_retry_interval=1,
)
Upload a file to the Hopsworks Filesystem.
import hopsworks
project = hopsworks.login()
dataset_api = project.get_dataset_api()
uploaded_file_path = dataset_api.upload("my_local_file.txt", "Resources")
Arguments
- local_path str: local path to the file to upload
- upload_path str: path to the directory in the Hopsworks Filesystem where the file is uploaded
- overwrite bool: overwrite the file if it exists
- chunk_size: upload chunk size in bytes. Default is 1048576 bytes
- simultaneous_uploads: number of chunks to upload simultaneously. Default is 3
- max_chunk_retries: maximum number of retries for a chunk. Default is 1
- chunk_retry_interval: chunk retry interval in seconds. Default is 1 second
Returns
str: Path to uploaded file
Raises
RestAPIError: If unable to upload the file
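To give a feel for the chunking parameters, the sketch below estimates how many chunks a file of a given size would need and passes tuned values to `upload`. The helper names `estimate_chunk_count` and `upload_large_file` are hypothetical, and the estimate assumes the file is split into consecutive chunks of at most `chunk_size` bytes, which is an assumption about the client rather than documented behaviour. The specific retry values are illustrative only.

```python
DEFAULT_CHUNK_SIZE = 1048576  # default chunk_size from the signature above (1 MiB)


def estimate_chunk_count(file_size_bytes, chunk_size=DEFAULT_CHUNK_SIZE):
    """Estimate the number of chunks, assuming fixed-size consecutive chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Integer ceiling division avoids floating-point rounding on large files.
    return (file_size_bytes + chunk_size - 1) // chunk_size


def upload_large_file(dataset_api, local_path, upload_path):
    """Upload with larger chunks and more retries than the defaults.

    upload_large_file is a hypothetical helper; the values are illustrative.
    """
    return dataset_api.upload(
        local_path,
        upload_path,
        overwrite=True,
        chunk_size=4 * DEFAULT_CHUNK_SIZE,  # 4 MiB chunks
        simultaneous_uploads=3,
        max_chunk_retries=3,
        chunk_retry_interval=2,  # seconds between retries
    )
```

Under the stated assumption, a 10 MiB file would need 10 chunks at the default size, or 3 chunks at 4 MiB.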