First, open the file definitions page from the dropdown menu at the top of the application.
The steps in this documentation require at least BI Developer permissions.
Clicking the “Add File Definition” button takes you to the file definitions form, a single-page form with a few required fields.
If you don’t completely understand the purpose of file definitions, the File Definitions Concepts page provides an overview of what file definitions do and how they are configured.
Each file definition requires a name, which acts as a unique label that makes it easy to identify what the file definition is used for.
File definitions require a type to state what the definition is being used for.
Loome Integrate currently supports 3 different types of File Definitions.
To learn more about these types, visit the File Definitions Concepts page.
This is the path to the folder which will be used for the file definition. The format of this depends on the type of file definition.
File Systems require the full path to the folder that will contain the files you’re working with. Note that Loome Integrate supports both Linux and Windows style paths (e.g. C:\CSVData and /CSVData are treated as the same).
Loome Integrate accepts the same paths you would use in a program like File Explorer, so copying the path from the address bar is an easy way to get it.
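As an illustration of how a Windows style and a Linux style path can point at the same folder name, here is a minimal Python sketch. The `folder_parts` helper is hypothetical and is not part of Loome Integrate; it only shows one way to compare the two styles, and it says nothing about how Loome Integrate resolves paths internally.

```python
from pathlib import PurePosixPath, PureWindowsPath


def folder_parts(path: str) -> list[str]:
    """Split a folder path into folder names, ignoring the drive or root."""
    # A backslash or a drive letter suggests a Windows style path.
    is_windows = "\\" in path or (len(path) > 1 and path[1] == ":")
    parsed = PureWindowsPath(path) if is_windows else PurePosixPath(path)
    # Drop the anchor ("C:\" or "/") so only the folder names remain.
    return [part for part in parsed.parts if part != parsed.anchor]


windows_parts = folder_parts(r"C:\CSVData")
linux_parts = folder_parts("/CSVData")
print(windows_parts == linux_parts)
```

Both calls reduce to the same single folder name, which is the sense in which the two example paths are equivalent.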
The path for an Azure Blob file definition is just the name of the folder you wish to work in. For example, if you had a container that had a folder called Data, then your path for the definition would just be Data.
As is the case with File System paths, this value is just the full path to the folder you wish to use within the HDFS/DBFS.
You can optionally select the path using the File Browser. It opens after you select the connection and the agent associated with the file system. You can navigate through the browser at the folder level and select the folder containing the files.
Selecting the path using the File Browser is recommended, as it avoids the mistakes that can occur when entering the path manually. However, this option is currently only available for browsing git repositories.
After selecting the agent and the connection, clicking the next button loads the File Browser. It might take a while to load all the contents of the git repository. You can then navigate within the browser to select any folder.
The File Browser displays all the folders in the repository. Clicking a folder name takes you to that folder and displays all the files and folders within it.
You can see the selected path in the path section at the top of the File Browser, as highlighted in the above image. Once you have selected the right path, click submit.
The path you selected in the File Browser is now displayed in the path section, as highlighted in the above image.
The file format field determines how Loome Integrate reads from and writes to files.
In most cases, you would use the “Delimited” format, which covers standard flat file types.
File Format | Description |
---|---|
Delimited | The files being processed are to be delimited using a human readable character or set of characters. |
Hex Delimited | The files being processed use a hexadecimal based delimiter. |
This is the character that is used to split cells. In most cases this will be a single character like a comma (.csv); however, if you need to use whitespace based characters like a tab (.tsv), you can use the delimiter dropdown.
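To make the role of the delimiter concrete, the following Python sketch parses the same two rows once with a comma and once with a tab. The `split_rows` helper and the sample data are made up for illustration; this uses Python's standard `csv` module and is not Loome Integrate's own parser.

```python
import csv
import io


def split_rows(text: str, delimiter: str) -> list[list[str]]:
    """Parse delimited text into rows of cells using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# The same data, delimited by a comma (.csv) and by a tab (.tsv).
comma_rows = split_rows("id,name\n1,Ada\n", ",")
tab_rows = split_rows("id\tname\n1\tAda\n", "\t")
print(comma_rows == tab_rows)
```

If the delimiter does not match the file, each line is read as a single cell, which is usually the first sign that the wrong delimiter was chosen.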
This is the file encoding that is used for reading and writing to the files associated with this file definition.
If you are unsure about what to use, select “UTF-8” as it supports the widest range of characters and languages.
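The short Python sketch below shows why UTF-8 is a safe default: text containing accented and non-Latin characters survives a round trip through UTF-8, while a narrower encoding such as ASCII rejects it. The sample text is invented for the example.

```python
# Sample row containing accented and non-Latin characters.
text = "café,naïve,日本語\n"

# UTF-8 can encode every character, so the round trip is lossless.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# ASCII cannot represent these characters and raises an error.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(round_trip == text, ascii_ok)
```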
This is the file extension to save and retrieve files with. Common examples for this include csv, dat and txt, but it ultimately depends on your requirements.
If you are using this File Definition as a migration target and are unsure as to what extension to use, it is recommended to use csv, as you can easily view the contents of the file with Microsoft Excel.
Loome Integrate Online supports Apache Parquet as a migration target out of the box. This means that if a target connection utilizes a file definition with the file format “Parquet”, Loome Integrate will automatically output the data to a Parquet file rather than a flat file.
Note that when using Parquet files, additional customisation options such as the delimiter and encoding are limited, as these factors do not affect the creation of Parquet files.
If your files need a header row (used for displaying the column names), you should check the header option so that Loome Integrate factors this in for migrations to and from the file definition.
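The effect of a header row can be seen in this small Python sketch, which reads the same sample data once treating the first line as column names and once treating every line as data. The data is invented, and the example uses Python's standard `csv` module rather than anything specific to Loome Integrate.

```python
import csv
import io

data = "id,name\n1,Ada\n2,Grace\n"

# With a header row: the first line supplies the column names.
with_header = list(csv.DictReader(io.StringIO(data)))

# Without a header row: every line, including the first, is treated as data.
without_header = list(csv.reader(io.StringIO(data)))

print(len(with_header), len(without_header))
```

Reading a file that has a header as if it had none adds the column names as an extra data row, which is why the setting needs to match the files.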
After creating or editing a file definition, it is highly recommended you validate it with an agent. This is as easy as clicking the tick button next to the file definition and selecting the agent that will be used to validate it.
Loome Integrate will ensure that the file path is valid, that it can access the files in the file definition, that the files have read privileges, and that the agent can work with the provided definition.
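If you want to pre-check a local folder before validating it with an agent, a rough equivalent of some of these checks can be sketched in Python. This is only an approximation under stated assumptions: the `check_folder` helper and its result keys are hypothetical, it only covers a local file system, and it does not reproduce what the agent actually does.

```python
import os
import tempfile


def check_folder(path: str, extension: str) -> dict[str, bool]:
    """Run basic local checks: folder exists, has matching files, files readable."""
    exists = os.path.isdir(path)
    files = (
        [name for name in os.listdir(path) if name.endswith("." + extension)]
        if exists
        else []
    )
    # Only report files as readable when the folder itself is valid.
    readable = exists and all(
        os.access(os.path.join(path, name), os.R_OK) for name in files
    )
    return {"path_valid": exists, "has_files": bool(files), "files_readable": readable}


# Demonstrate against a temporary folder containing one sample file.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "sample.csv"), "w", encoding="utf-8") as handle:
        handle.write("id,name\n1,Ada\n")
    result = check_folder(folder, "csv")

# A folder that does not exist fails the first check.
missing = check_folder(os.path.join(folder, "does-not-exist"), "csv")
print(result, missing)
```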