
# Bulk import data

> In the Data Inputs section, bulk import data through a project-specific Data Source.

## Importing data

<Steps>
  <Step title="Prepare your data">
    Prepare the raw data file that you need to bring into your project in Mangrove. Acceptable file formats include CSV, XLS, and XLSX.

    * Ensure you're using the right import template.
    * Populate all required columns in the template.
      <Note>To streamline imports, ensure that data in the import template uses the correct formatting (e.g., date or datetime format) and units (e.g., percentage values should be between 0 and 100). A best practice is to “paste as values only” into the import template to avoid copying external links and formulas into it.</Note>
  </Step>

  <Step title="Upload the file">
    Upload the file into the corresponding Data Source in the Mangrove platform.

    * Orders recorded for this customer will have the currency set as the default
    * Save contacts, and important information about the customer's registry accounts
  </Step>

  <Step title="Review the loaded events">
    Events generated from your bulk import populate the Analytics charts and Events Feed.

    * View event data by selecting an event in the feed.
    * Attach evidence files to relevant events with **Upload Evidence** on each event.
    * Filter events by event type and time range in the feed and on the Analytics charts.

    To check the status of ongoing bulk imports, visit the Data Inputs > Bulk Jobs section. If you encounter an error that you need help with, reach out to Mangrove through your shared channel.

    * Review issues with imports: problems with a bulk import appear as `errored` bulk jobs. Review the error message for details on what went wrong while transforming the data in your import file.
    * Cancel or reverse an import: imported the wrong data by mistake? A completed bulk job can be **Reversed**. A bulk job that has already started can be **Canceled**; any data already transformed from your import file will be deleted.
    * **Retry** an import
  </Step>
</Steps>
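
The formatting advice in step 1 can be checked before uploading. The following is a minimal sketch, not a Mangrove tool: it scans a CSV export of a template and reports rows whose percentage values fall outside 0 to 100 or whose timestamps do not parse. The column names (`recorded_at`, `moisture_pct`) and the datetime format are hypothetical; substitute the ones from your own import template.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names and format; replace with those in your template.
DATETIME_COLUMN = "recorded_at"
PERCENT_COLUMN = "moisture_pct"
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def find_problems(csv_file):
    """Return a list of (row_number, message) for rows that look malformed."""
    problems = []
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(csv.DictReader(csv_file), start=2):
        try:
            datetime.strptime(row[DATETIME_COLUMN], DATETIME_FORMAT)
        except ValueError:
            problems.append((row_number, f"bad datetime: {row[DATETIME_COLUMN]!r}"))
        try:
            percent = float(row[PERCENT_COLUMN])
        except ValueError:
            problems.append((row_number, f"not a number: {row[PERCENT_COLUMN]!r}"))
            continue
        if not 0 <= percent <= 100:
            problems.append((row_number, f"percent out of range: {percent}"))
    return problems


# In-memory sample; use open("your_file.csv", newline="") for a real file.
sample = io.StringIO(
    "recorded_at,moisture_pct\n"
    "2024-02-01 12:00:00,45.5\n"
    "02/01/2024 12:00,0.455\n"
    "2024-02-01 13:00:00,145\n"
)
problems = find_problems(sample)
for row_number, message in problems:
    print(f"row {row_number}: {message}")
```

A range check like this cannot tell a fraction (`0.455`) from a small percentage, so values that are in range but suspiciously small still need a manual look.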

## Data transformations

Transformations generate events and evidences from bulk import files. Admin users write them for a Data Source, and they are applied to every new file bulk imported through that Data Source.

### Editing transformations on each Data Source

Admin users can define the event types that are generated from every Data Source through the Transformations Editor.

* In **Data Inputs > Input Settings**, select an existing Data Source
* **Add transformation**
* Select an event type for data from the Data Source
* Edit the Python transformation in the editor

Transformations are written in Python and must produce a `results` array of objects, each representing an event to generate from the import data. An example is shown below:

```python Transformation theme={null}
from datetime import datetime

# `rows` and `parse_datetime` are not defined in this snippet; the example
# assumes the transformation environment provides them.
results = []
for row in rows:
	event_object = {
		"event_type": "<event slug>",
		"notes": row["<notes column>"],
		"start_time": parse_datetime(row["<start time column>"], row["<timezone column>"]).isoformat(),
		"end_time": parse_datetime(row["<end time column>"], row["<timezone column>"]).isoformat(),
		"locations": [
			{
				"name": row["<location name column>"],
				"lat": 43.6499286,
				"long": -79.3858228
			}
		],
		"feedstock": {
			"name": "Dairy - Krol Farms",
			"feedstock_type": "Animal Waste"
		},
		"evidences": [
			{
				"name": "<name for evidence file>",
				"ref_id": "evidence_1",
				"type": "json",
				"content": {
					"some_key": "some_content"
				}
			}
		],
		"data_points": [
			{
				"slug": "<datapoint 1 slug>",
				"value": row["<datapoint1 column>"],
				"evidence_refs": ["evidence_1"]
			},
			{
				"slug": "<datapoint 2 slug>",
				"value": row["<datapoint2 column>"]
			}
		]
	}
	results.append(event_object)
```

Here’s an example of what the `results` array might look like following execution of the transformation above:

```python results theme={null}
[
	{
		"event_type": "<event slug>",
		"notes": "This is an example event note",
		"start_time": "2024-02-01 12:00:00-04:00",
		"end_time": "2024-02-01 15:30:00-04:00",
		"locations": [
			{
				"name": "<location name>",
				"lat": 43.6499286,
				"long": -79.3858228
			}
		],
		"feedstock": {
			"name": "Dairy - Krol Farms",
			"feedstock_type": "Animal Waste"
		},
		"evidences": [
			{
				"name": "<name for evidence file>",
				"ref_id": "evidence_1",
				"type": "json",
				"content": {
					"some_key": "some_content"
				}
			}
		],
		"data_points": [
			{
				"slug": "<datapoint 1 slug>",
				"value": "<datapoint 1 value>",
				"evidence_refs": ["evidence_1"]
			},
			{
				"slug": "<datapoint 2 slug>",
				"value": "<datapoint 2 value>"
			}
		]
	},
	{
		# another event object
		...
	}
]
```

Mangrove will process this result to create events in the `Data Inputs > Events` feed that can be used to run production models.
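
To try a transformation like the one above outside the editor, you can stub the inputs it relies on. The sketch below assumes that `rows` is a list of dictionaries keyed by column header and that `parse_datetime` takes a timestamp string and a timezone value and returns a `datetime`. This page confirms neither, and the stand-in `parse_datetime`, the column names, and the slugs are all hypothetical.

```python
from datetime import datetime


def parse_datetime(value, utc_offset):
    """Stand-in for the editor's helper (assumed signature): combine a naive
    timestamp string with a UTC offset such as "-04:00"."""
    return datetime.fromisoformat(f"{value}{utc_offset}")


# Hypothetical rows, shaped like csv.DictReader output.
rows = [
    {"start": "2024-02-01 12:00:00", "end": "2024-02-01 15:30:00",
     "tz": "-04:00", "site": "Digester A", "volume_m3": "12.5"},
]

results = []
for row in rows:
    results.append({
        "event_type": "biogas-production",  # hypothetical event slug
        "start_time": parse_datetime(row["start"], row["tz"]).isoformat(),
        "end_time": parse_datetime(row["end"], row["tz"]).isoformat(),
        "locations": [{"name": row["site"], "lat": 43.6499286, "long": -79.3858228}],
        "data_points": [
            # hypothetical datapoint slug; value converted from text to a number
            {"slug": "biogas-volume", "value": float(row["volume_m3"])},
        ],
    })

print(results[0]["start_time"])
```

With Python's standard `datetime`, `isoformat()` separates the date and time with `T` by default; pass `sep=" "` if a space separator is needed.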

#### Transformed Event Attributes

Below is the full list of fields that can be defined for each event object transformed from the import data:

<AccordionGroup>
  <Accordion title="Props" defaultOpen={true}>
    <ParamField type="string" body="start_time" required>
      A timestamp in the format `%Y-%m-%d %H:%M:%S%z` (e.g., `2024-02-01 12:00:00-04:00`), where the timezone is specified as a `+/-` offset.
    </ParamField>

    <ParamField type="string" body="end_time" required>
      A timestamp in the format `%Y-%m-%d %H:%M:%S%z` (e.g., `2024-02-01 12:00:00-04:00`), where the timezone is specified as a `+/-` offset.
    </ParamField>

    <ParamField type="object[]" body="data_points" required>
      An array of data points, each specifying a `slug` identifier for the corresponding datapoint type and a `value` matching the expected `value_type` for that datapoint type. Optionally include `evidence_refs`, an array of `ref_id` strings referencing the evidences to attach to that specific data point.
    </ParamField>

    <ParamField type="string" body="tracking_id" />

    <ParamField type="object[]" body="locations">
      An array of objects representing either existing locations or new locations to be created. The system will attempt to match with existing locations using the `name` attribute. If it cannot find an existing location, it will create a new one using the `name`, `lat`, and `long` attributes provided.
    </ParamField>

    <ParamField type="object" body="feedstock">
      An object representing an existing feedstock in the project. Requires `name` and `feedstock_type` fields.
    </ParamField>

    <ParamField type="object[]" body="evidences">
      An array of objects representing text-based evidence files to attach to the event. Each object must define a `name` for the file (without file extension), a file `type` (supported: `json | geojson | text | csv`), and file `content` as a string. Optionally include a `ref_id` string to link the evidence to specific data points via `evidence_refs`. If no `ref_id` is provided, the evidence is attached to all data points on the event.
    </ParamField>

    <ParamField type="string" body="notes" />
  </Accordion>
</AccordionGroup>
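
The field rules above can be checked locally before saving a transformation. The following is a minimal sketch, not Mangrove's own validation: the `validate_event` helper is hypothetical, and it assumes the `data_points` key used in the example transformation.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S%z"
EVIDENCE_TYPES = {"json", "geojson", "text", "csv"}


def validate_event(event):
    """Return a list of human-readable problems found in one event object."""
    problems = []

    for field in ("start_time", "end_time"):
        try:
            datetime.strptime(event[field], TIME_FORMAT)
        except KeyError:
            problems.append(f"missing required field: {field}")
        except (TypeError, ValueError):
            problems.append(f"{field} does not match {TIME_FORMAT}")

    # Collect the ref_ids declared by evidences on this event.
    ref_ids = set()
    for evidence in event.get("evidences", []):
        if evidence.get("type") not in EVIDENCE_TYPES:
            problems.append(f"unsupported evidence type: {evidence.get('type')!r}")
        if "ref_id" in evidence:
            ref_ids.add(evidence["ref_id"])

    data_points = event.get("data_points")
    if not data_points:
        problems.append("missing required field: data_points")
    for point in data_points or []:
        if "slug" not in point or "value" not in point:
            problems.append(f"data point needs slug and value: {point!r}")
        for ref in point.get("evidence_refs", []):
            if ref not in ref_ids:
                problems.append(f"evidence_refs entry {ref!r} matches no evidence ref_id")

    return problems


event = {
    "start_time": "2024-02-01 12:00:00-04:00",
    "end_time": "2024-02-01T15:30:00-04:00",  # "T" separator: reported as a mismatch
    "evidences": [{"name": "meter", "ref_id": "evidence_1", "type": "json", "content": "{}"}],
    "data_points": [
        {"slug": "volume", "value": 12.5, "evidence_refs": ["evidence_2"]},
    ],
}
for problem in validate_event(event):
    print(problem)
```

Running `validate_event` over every item in `results` before saving catches a missing required field or a dangling `evidence_refs` entry early, rather than as an `errored` bulk job.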

#### Using external packages in transformations

Apart from the Python standard library, Mangrove also supports a curated list of external packages that you can import and use in transformations:

* `mapbox`
* `boto3`
* `geopy`
* `pandas`
* `numpy`
