Databricks

Updated on Jan 22, 2024

IMPORTANT: This article covers setting up a warehouse that receives data loaded by Improvado. It does not cover the customer data warehouse from which data is extracted, nor the setup of a customer data warehouse for Data Prep.

Required information

Server hostname

  • Azure Databricks - ```https://adb-XXXX.XX.azuredatabricks.net```
  • AWS Databricks - ```https://dbc-XXXX.cloud.databricks.com```
  • Google Cloud Databricks - ```https://XXXX.X.gcp.databricks.com```

Filepath

Possible parameters:

```/FileStore/{{filename}}-{{YYYY}}-{{MM}}-{{DD}}-{{timestamp}}```

  • ```{{filename}}``` is the same as the destination table name
  • ```{{timestamp}}``` is the date and time when the data load started

IMPORTANT: Do not use ```{{DD}}``` when partitioning by month.

  • ```{{filename}}-{{YYYY}}-{{MM}}-{{DD}}``` – for partition by day
  • ```{{filename}}-{{YYYY}}-{{MM}}``` – for partition by month

You can also use “_” instead of “-”, or omit the separators entirely, for example:

  • ```{{filename}}_{{YYYY}}-{{MM}}-{{DD}}-{{timestamp}}```
  • ```{{filename}}{{YYYY}}{{MM}}{{DD}}{{timestamp}}```
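To illustrate how the placeholders above expand, here is a minimal Python sketch. It is not Improvado's implementation: the function name `render_filepath` and the table name `ad_spend` are hypothetical, and rendering ```{{timestamp}}``` as Unix epoch seconds is an assumption, since this article only says it is the date and time when the data load started.

```python
from datetime import datetime, timezone


def render_filepath(template: str, table_name: str, started_at: datetime) -> str:
    """Substitute the documented placeholders into a filepath template."""
    values = {
        "{{filename}}": table_name,  # same as the destination table name
        "{{YYYY}}": f"{started_at.year:04d}",
        "{{MM}}": f"{started_at.month:02d}",
        "{{DD}}": f"{started_at.day:02d}",
        # Assumption: the timestamp is rendered as Unix epoch seconds.
        "{{timestamp}}": str(int(started_at.timestamp())),
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


# Hypothetical load start time and table name, for illustration only.
started = datetime(2024, 1, 22, 8, 30, 0, tzinfo=timezone.utc)
daily = render_filepath("/FileStore/{{filename}}-{{YYYY}}-{{MM}}-{{DD}}", "ad_spend", started)
monthly = render_filepath("/FileStore/{{filename}}_{{YYYY}}{{MM}}", "ad_spend", started)
print(daily)    # /FileStore/ad_spend-2024-01-22
print(monthly)  # /FileStore/ad_spend_202401
```

The actual timestamp format that Improvado writes may differ, so check a loaded file name before relying on it.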

Partition by

Possible ways of splitting data:

  • Day
  • Month
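The rule from the Filepath section, that ```{{DD}}``` cannot be combined with monthly partitioning, can be checked before a configuration is saved. The sketch below is a hypothetical helper (the name `validate_template` is not part of any Improvado API) and enforces only the rule stated in this article.

```python
def validate_template(template: str, partition_by: str) -> None:
    """Raise ValueError if the filepath template conflicts with the partition setting."""
    partition = partition_by.lower()
    if partition not in ("day", "month"):
        raise ValueError(f"Unknown partition setting: {partition_by!r}")
    # Documented rule: {{DD}} cannot be used when partitioning by month.
    if partition == "month" and "{{DD}}" in template:
        raise ValueError("{{DD}} cannot be used when partitioning by month")


validate_template("{{filename}}-{{YYYY}}-{{MM}}-{{DD}}", "Day")   # passes
validate_template("{{filename}}-{{YYYY}}-{{MM}}", "Month")        # passes
```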

File format

Possible formats:

  • csv
  • csv+gzip
  • json
  • json+gzip
  • parquet
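As a rough illustration of what the format names imply for anyone reading the loaded files, the following sketch decodes a file's bytes by format using only the Python standard library. It rests on several assumptions that this article does not confirm: CSV files start with a header row, JSON files contain one object per line, and files are UTF-8 encoded. The function name `decode_rows` is hypothetical. Parquet is left out because reading it requires a third-party library.

```python
import csv
import gzip
import io
import json


def decode_rows(payload: bytes, file_format: str) -> list[dict]:
    """Decode file bytes into rows for the csv, csv+gzip, json, and json+gzip formats."""
    base, _, compression = file_format.partition("+")
    if compression == "gzip":
        payload = gzip.decompress(payload)
    elif compression:
        raise ValueError(f"Unsupported compression: {compression!r}")
    text = payload.decode("utf-8")  # assumption: files are UTF-8 encoded
    if base == "csv":
        # Assumption: the first line is a header row, comma-separated.
        return list(csv.DictReader(io.StringIO(text)))
    if base == "json":
        # Assumption: newline-delimited JSON, one object per line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"Format not handled in this sketch: {file_format!r}")


sample = b"date,clicks\n2024-01-22,10\n"
rows_plain = decode_rows(sample, "csv")
rows_gzip = decode_rows(gzip.compress(sample), "csv+gzip")
```

Note that CSV values come back as strings, while JSON preserves numeric types.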

Separator

Possible delimiters that can separate data in your file:

  • comma
  • semicolon
  • tab
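The separator names above correspond to single characters. A minimal sketch of that mapping, assuming the setting applies to the csv and csv+gzip formats, might look like this (`DELIMITERS` and `parse_delimited` are hypothetical names):

```python
import csv
import io

# Mapping from the documented separator names to the characters they stand for.
DELIMITERS = {"comma": ",", "semicolon": ";", "tab": "\t"}


def parse_delimited(text: str, separator: str) -> list[list[str]]:
    """Parse delimited text using one of the documented separator names."""
    return list(csv.reader(io.StringIO(text), delimiter=DELIMITERS[separator]))


rows = parse_delimited("date;clicks\n2024-01-22;10\n", "semicolon")
print(rows)  # [['date', 'clicks'], ['2024-01-22', '10']]
```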

Schema information

Setup guide

Settings


Troubleshooting

Troubleshooting guides

Check out the troubleshooting guides for Databricks here.

Limits

Frequently asked questions


Questions?

The Improvado team is always happy to help with any other questions you might have! Send us an email.

Contact your Customer Success Manager or raise a request in Improvado Service Desk.