# SFTP

> Configure SFTP datasources

SFTP (SSH File Transfer Protocol) datasources allow CloudQuant Data Liberator to read CSV, TSV, and other delimited files from remote servers over an encrypted SSH connection. CloudQuant Data Liberator mounts the remote directory via SSHFS/FUSE.

<Note>
  See [Supported Data Formats](/datasource-config/supported-formats) for every file extension Liberator can ingest over SFTP, including formats added in 2.1 and 2.2.
</Note>

## Connection configuration

### Required fields

| Field             | Type   | Description                        |
| ----------------- | ------ | ---------------------------------- |
| `connection_type` | string | Must be `"sftp"`                   |
| `host`            | string | SFTP server hostname or IP address |
| `user`            | string | Username for authentication        |

<Note>
  You must provide either `password` or `key` for authentication. If both are specified, key-based authentication takes precedence.
</Note>

### Optional fields

| Field         | Type   | Default | Description                                                       |
| ------------- | ------ | ------- | ----------------------------------------------------------------- |
| `port`        | int    | `22`    | SSH port number                                                   |
| `password`    | string |         | Password for password-based authentication                        |
| `key`         | string |         | SSH private key content (PEM format) for key-based authentication |
| `prefix`      | string | `""`    | Remote directory path to use as root                              |
| `mount_point` | string |         | Local mount path for FUSE-based access                            |
| `config_name` | string |         | Internal configuration identifier                                 |
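
The required fields and the authentication rule above can be checked before a connection is deployed. The following is a minimal sketch, not part of Liberator itself: the helper name `resolve_auth` is hypothetical, and it only mirrors the rules stated on this page (three required fields, at least one of `password` or `key`, with `key` taking precedence).

```python
def resolve_auth(conn: dict) -> str:
    """Return "key" or "password" for a connection dict, or raise ValueError.

    Hypothetical pre-flight check; it mirrors the documented rules only.
    """
    # The three fields listed under "Required fields".
    for field in ("connection_type", "host", "user"):
        if not conn.get(field):
            raise ValueError(f"missing required field: {field}")
    if conn["connection_type"] != "sftp":
        raise ValueError('connection_type must be "sftp"')

    # Key-based authentication takes precedence when both are present.
    if conn.get("key"):
        return "key"
    if conn.get("password"):
        return "password"
    raise ValueError('provide either "password" or "key"')
```

Calling `resolve_auth` on a connection that sets both `password` and `key` returns `"key"`, matching the precedence described in the note.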

### Example connection (password authentication)

```json theme={null}
{
  "name": "sftp-vendor-data",
  "connection_type": "sftp",
  "host": "sftp.vendor.example.com",
  "port": 22,
  "user": "datauser",
  "password": "s3cur3P@ssw0rd",
  "prefix": "/data/daily-feeds/"
}
```

### Example connection (key authentication)

```json theme={null}
{
  "name": "sftp-internal-data",
  "connection_type": "sftp",
  "host": "data-server.internal.net",
  "port": 2222,
  "user": "liberator-svc",
  "key": "-----BEGIN OPENSSH PRIVATE KEY-----\nb3BlbnNza...\n-----END OPENSSH PRIVATE KEY-----",
  "prefix": "/exports/market-data/"
}
```

<Warning>
  Avoid embedding private keys or passwords directly in configuration files. Use environment variables or a secrets manager to inject credentials at deployment time.
</Warning>
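
One way to follow this advice is to assemble the connection at deployment time from environment variables. The sketch below is illustrative only: the variable names `SFTP_PRIVATE_KEY` and `SFTP_PASSWORD` and the helper `build_connection` are assumptions, not names that Liberator defines, and the non-secret values are copied from the password example above.

```python
import os


def build_connection(env=os.environ) -> dict:
    """Build an SFTP connection dict, reading credentials from the environment.

    SFTP_PRIVATE_KEY and SFTP_PASSWORD are hypothetical variable names.
    """
    conn = {
        "name": "sftp-vendor-data",
        "connection_type": "sftp",
        "host": "sftp.vendor.example.com",
        "port": 22,
        "user": "datauser",
        "prefix": "/data/daily-feeds/",
    }
    key = env.get("SFTP_PRIVATE_KEY")
    password = env.get("SFTP_PASSWORD")
    if key:
        # Prefer the key, since key-based authentication takes precedence.
        conn["key"] = key
    elif password:
        conn["password"] = password
    else:
        raise RuntimeError("set SFTP_PRIVATE_KEY or SFTP_PASSWORD")
    return conn
```

The returned dictionary can be serialized with `json.dumps(build_connection(), indent=2)` and handed to whatever mechanism your deployment uses to register connections, so the secret never lands in a file under version control.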

## Dataset configuration (data\_args)

The `data_args` fields are the same as for every other file-based source; see [Local File](/datasource-config/local-file) for the full reference. The `file_pattern` is evaluated relative to the `prefix` configured on the connection.

### Required data\_args

| Field             | Type           | Description                                                         |
| ----------------- | -------------- | ------------------------------------------------------------------- |
| `file_pattern`    | string         | Glob pattern relative to the prefix, e.g., `"*.csv"`                |
| `data_dt_column`  | string or list | Column(s) containing the datetime value                             |
| `data_dt_format`  | string or list | strptime format or special values (`"muts"`, `"uts"`, `"datetime"`) |
| `data_key_column` | string or list | Symbol/key column(s)                                                |

### Optional data\_args

| Field                     | Type   | Default              | Description                              |
| ------------------------- | ------ | -------------------- | ---------------------------------------- |
| `sep_override`            | string | `","`                | Delimiter character                      |
| `encoding`                | string | `"utf-8"`            | File encoding                            |
| `data_dt_timezone`        | string | `"UTC"`              | Source data timezone                     |
| `fname_dt_regex`          | string |                      | Regex to extract date from filename      |
| `fname_dt_format`         | string |                      | strptime format for filename date        |
| `fname_dt_timezone`       | string |                      | Timezone of filename date                |
| `fname_dt_nudge`          | int    | `0`                  | Microsecond offset for filename date     |
| `fname_dt_approx_seconds` | int    |                      | Approximate number of seconds of data each file covers |
| `arrow_sort`              | list   | `["symbol", "muts"]` | Sort order                               |
| `arrow_timestamp`         | bool   | `true`               | Generate human-readable timestamp column |
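
Before deploying, it can help to test `fname_dt_regex` and `fname_dt_format` against a few real filenames. The sketch below is a standalone check and not Liberator's internal logic: the helper `filename_date` is hypothetical, it assumes the date sits in the first capture group of the regex, and it does not apply `fname_dt_timezone` or `fname_dt_nudge`.

```python
import re
from datetime import datetime


def filename_date(fname: str, fname_dt_regex: str, fname_dt_format: str):
    """Return the naive datetime encoded in a filename, or None if no match.

    Assumption: the date is the first capture group of fname_dt_regex.
    """
    match = re.search(fname_dt_regex, fname)
    if match is None:
        return None
    # strptime raises ValueError if the captured text does not fit the format.
    return datetime.strptime(match.group(1), fname_dt_format)


# Example: a daily file whose name carries the date as YYYYMMDD.
print(filename_date("trades_20240102.csv.gz", r"trades_(\d{8})\.csv\.gz", "%Y%m%d"))
```

A `None` result means the regex did not match the filename, and a `ValueError` means the captured text does not fit the format string.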

## Complete example

### Connection

```json theme={null}
{
  "name": "sftp-trades-feed",
  "connection_type": "sftp",
  "host": "sftp.dataprovider.com",
  "port": 22,
  "user": "cq-ingest",
  "password": "vendorPassword123",
  "prefix": "/feeds/trades/"
}
```

### Dataset

```json theme={null}
{
  "name": "vendor-trades",
  "connection": "sftp-trades-feed",
  "data_args": {
    "file_pattern": "trades_*.csv.gz",
    "sep_override": ",",
    "encoding": "utf-8",
    "data_dt_column": ["date", "time"],
    "data_dt_format": ["%Y%m%d", "%H:%M:%S.%f"],
    "data_dt_timezone": "America/New_York",
    "data_key_column": "symbol",
    "fname_dt_regex": "trades_(\\d{8})\\.csv\\.gz",
    "fname_dt_format": "%Y%m%d",
    "fname_dt_timezone": "America/New_York",
    "fname_dt_approx_seconds": 86400,
    "arrow_sort": ["symbol", "muts"],
    "arrow_timestamp": true
  },
  "schema": [
    { "name": "symbol", "type": "string", "group": "key", "description": "Ticker symbol" },
    { "name": "date", "type": "string", "group": "time", "description": "Trade date" },
    { "name": "time", "type": "string", "group": "time", "description": "Trade time" },
    { "name": "price", "type": "double", "group": "value", "description": "Trade price" },
    { "name": "size", "type": "int64", "group": "value", "description": "Trade size" },
    { "name": "condition", "type": "string", "group": "value", "description": "Sale condition code" }
  ]
}
```
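
The `data_dt_column` and `data_dt_format` lists in this dataset can be sanity-checked against a sample row before ingestion. This is a rough sketch under stated assumptions: the helper `parse_row_datetime` is hypothetical, joining the columns and formats with a single space is an assumption about how the pieces combine, the special values (`"muts"`, `"uts"`, `"datetime"`) are not handled, and `data_dt_timezone` is not applied.

```python
from datetime import datetime


def parse_row_datetime(row: dict, data_dt_column, data_dt_format) -> datetime:
    """Parse the datetime of one row from one or more columns.

    Assumption: columns and formats pair up one-to-one and are joined
    with a single space before parsing.
    """
    columns = [data_dt_column] if isinstance(data_dt_column, str) else list(data_dt_column)
    formats = [data_dt_format] if isinstance(data_dt_format, str) else list(data_dt_format)
    if len(columns) != len(formats):
        raise ValueError("data_dt_column and data_dt_format must have equal length")
    text = " ".join(str(row[col]) for col in columns)
    return datetime.strptime(text, " ".join(formats))


# Hypothetical sample row shaped like the schema above.
sample = {"symbol": "AAPL", "date": "20240102", "time": "09:30:00.123456"}
print(parse_row_datetime(sample, ["date", "time"], ["%Y%m%d", "%H:%M:%S.%f"]))
```

If a format string does not match the sample values, `strptime` raises `ValueError`, which surfaces the mismatch before any files are read over SFTP.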

<Tip>
  CloudQuant Data Liberator supports reading gzip-compressed files (`.csv.gz`) transparently. Use compressed files on SFTP connections to reduce transfer time over slow or high-latency links.
</Tip>

## Network requirements

The CloudQuant Data Liberator host needs the following network access to the SFTP server:

| Requirement        | Detail                                                             |
| ------------------ | ------------------------------------------------------------------ |
| **Outbound port**  | TCP port 22 (or custom port) to the SFTP server                    |
| **DNS resolution** | The hostname must resolve from the CloudQuant Data Liberator host  |
| **Firewall rules** | Whitelist the CloudQuant Data Liberator host IP on the SFTP server |
| **SSH host key**   | The server's host key must be trusted (added to known\_hosts)      |
