Greetings. Let’s jump right to it. This blog will discuss:
- How to upload a file to a connector
- A new Python CSV to Kenna Data Importer (KDI) JSON
Uploading a File to a Connector
As you might recall from the previous blog, Automating Connector Runs, in the section “Obtaining connector runs”, uploading files to a connector was skipped because the focus was on host-based connectors. That gap is now filled by blog_auto_run_connectors.py, which started from connectors_auto_start.py. (I didn’t want to change connectors_auto_start.py since the earlier blog depended on the line numbers in the code.) Let’s take a look at the additional code for uploading a file.
Verifying a File
Since there is no database to specify the file name to upload, the script creates a file name from the connector name.
```python
50 # Forge a file name from the connector name, and verify it is a file.
51 def forge_and_verify_file_name(file_name):
52     file_ok = False
53
54     file_name = file_name.replace(" ", "_") + ".json"
55     file_name = os.path.abspath(file_name)
56     if os.path.isfile(file_name):
57         file_ok = True
58
59     return (file_ok, file_name)
60
```
Any spaces in the connector name are replaced with underscores, and the file suffix .json is added in line 54. In line 55 an absolute file path is created; an absolute file path is required for upload by the API. The file’s existence is verified in line 56. Fortunately, an existential crisis is not created.
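To see what file name a given connector name maps to, here is the same name-forging logic in isolation (a re-statement of the helper above, with a made-up connector name for illustration):

```python
import os

# Re-statement of forge_and_verify_file_name() from the script, for illustration.
def forge_and_verify_file_name(file_name):
    file_name = file_name.replace(" ", "_") + ".json"
    file_name = os.path.abspath(file_name)
    return (os.path.isfile(file_name), file_name)

ok, path = forge_and_verify_file_name("Generic File Connector")
print(os.path.basename(path))  # Generic_File_Connector.json
print(ok)  # False unless the file already exists in the current directory
```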
Uploading a File
Now to the crux. The “Upload Data File” API is used with the connector ID in the URL. The absolute file name path and the run Boolean go in the body parameters.
```python
61 # Upload a file and run the connector.
62 def upload_and_run_file_connector(base_url, headers, connector_id, connector_name, upload_file_name):
63     upload_file_url = f"{base_url}connectors/{connector_id}/data_file"
64
65     # Remove Content-Type or it won't work.  (Copy first so the shared headers stay intact.)
66     upload_headers = headers.copy()
67     upload_headers.pop("Content-Type")
68
69     try:
70         upload_f = open(upload_file_name, 'rb')
71     except FileNotFoundError:
72         print(f"File {upload_file_name} should exist!")
73         sys.exit(1)
74
75     files = {
76         'file': (upload_file_name, upload_f, 'application/json')
77     }
78
79     payload = {
80         'run': True
81     }
82
83     response = requests.post(upload_file_url, headers=upload_headers, data=payload, files=files)
84     if response.status_code != 200:
85         print(f"Upload File Connector Error: {response.status_code} for {connector_name} with {upload_file_url}")
86         sys.exit(1)
87
88     resp_json = response.json()
89     if not resp_json['success']:
90         print(f"Uploading {upload_file_name} for {connector_name} ({connector_id}) failed. Check log files.")
91         sys.exit(1)
92
93     return resp_json['connector_run_id']
94
```
In line 63, the connector_id is added to the URL along with "data_file".
For the “Upload Data File” API, the Content-Type HTTP header must not be present. Line 67 removes Content-Type. Next, in line 70, the file to upload is opened in read-only, binary mode.
Uploading the file with the Python requests library requires the files parameter. I personally found the requests library documentation a little weak on how to use the files parameter, but there are many helpful websites out there; for example, Tutorialspoint and StackOverflow.
The data parameter is used to pass the run Boolean. In this case, run is set to True, which indicates that the connector should run after the file is uploaded.
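As a standalone illustration of how files and data fit together (with a placeholder URL, token, and file content), here is the same multipart POST built but not sent, so you can inspect what requests would transmit:

```python
import requests

# Placeholder values for illustration only.
upload_file_url = "https://api.kennasecurity.com/connectors/12345/data_file"
upload_headers = {
    "X-Risk-Token": "<your-api-token>",
    "Accept": "application/json",
}

# 'files' carries the file part; 'data' carries the plain form fields.
files = {"file": ("assets.json", b'{"assets": []}', "application/json")}
payload = {"run": True}

# Build the request without sending it, to show the resulting body.
prepared = requests.Request(
    "POST", upload_file_url, headers=upload_headers, data=payload, files=files
).prepare()
print(prepared.headers["Content-Type"])  # multipart/form-data; boundary=...
```

Note that requests sets the multipart Content-Type (including the boundary) itself, which is exactly why the script must not supply its own Content-Type header.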
Line 83 puts it all together in the requests POST call. On success, the connector_run_id in the API response is returned in line 93.
Logic Changes for File Connectors
Starting at line 198 is the additional logic for file connectors.
```python
198     # Check if connector is file based.
199     if connector['host'] is None:
200         (file_ok, upload_file_name) = forge_and_verify_file_name(name)
201         if not file_ok:
202             conn_tbl.add_row([name, f"file based connector expecting {upload_file_name}"])
203             continue
204
205         # Run the connector if the file is younger than the last connector run.
206         file_mod_time = os.path.getmtime(upload_file_name)
207         file_mod_datetime = datetime.fromtimestamp(file_mod_time)
208         if file_mod_datetime > end_datetime:
209             connector_run_id = upload_and_run_file_connector(base_url, headers, id, name, upload_file_name)
210             conn_tbl.add_row([name, f"{upload_file_name} uploaded, and launched connector run {connector_run_id}."])
211             continue
212         else:
213             conn_tbl.add_row([name, f"{upload_file_name} has not been modified since last connector run"])
214             continue
215
```
At line 199, if the connector doesn’t have an associated host, then it is a file connector. As you might recall, this is where the previous blog bailed. But now, file-based connectors are processed. There are three states:
- The file to upload doesn’t exist (lines 201–203).
- The file is uploaded and a connector run is launched (lines 208–211).
- The upload file has not been modified since the last connector run (lines 208, 213–214).
The upload file’s modification time is obtained in lines 206 and 207. If the modification time is after the last connector run, upload_and_run_file_connector() is called.
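The freshness check can be demonstrated in isolation with a throwaway temporary file; here end_datetime is a stand-in for the timestamp of the last connector run:

```python
import os
import tempfile
from datetime import datetime, timedelta

# Create a throwaway file so it has a fresh modification time.
with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
    tmp.write(b"{}")
    upload_file_name = tmp.name

# Stand-in for the last connector run time (one hour ago).
end_datetime = datetime.now() - timedelta(hours=1)

file_mod_datetime = datetime.fromtimestamp(os.path.getmtime(upload_file_name))
if file_mod_datetime > end_datetime:
    print("file is newer than the last run: upload and run the connector")
else:
    print("file unchanged since the last run: skip the upload")

os.remove(upload_file_name)
```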
Python Upload File Summary
The critical portions of the file upload code have been covered. There are some other changes: there is now a help option, and a -f option to force running a connector even if it hasn’t been run manually for the first time. Some error message enhancements have also been added. If you’re curious about the differences between blog_auto_run_connectors.py and connectors_auto_start.py, I suggest doing a visual file diff.
Finally, show_connector_status.py was updated to show file connector status.
Curl Upload File
I know some of you out there enjoy writing bash scripts with curl, so here is a simple curl example:
```bash
 1 #!/bin/bash
 2
 3 if (( $# == 0 ))
 4 then
 5     echo "Requires a KDI JSON file"
 6     echo "upload_run_conn"
 7     exit
 8 fi
 9
10 upload_file=$1
11
12 echo "Uploading ${upload_file}"
13
14 curl --request POST \
15   --url https://yf15.kennasecurity.com/connectors/961696/data_file \
16   --header 'X-Risk-Token: yCQEJPlanck-6626x10-34tdmKni39JzVYx2RB3zO2' \
17   --header 'accept: application/json' \
18   --header 'content-type: multipart/form-data' \
19   --form file=@${upload_file} \
20   --form run=true
21
```
The connector is preset in the URL with the connector ID 961696 (line 15), but the file to upload is a parameter (line 10). Curl uses the --form parameter with file= for the file upload (line 19). Notice the required @ to declare a file name; here the file name does not have to be an absolute path.
Python CSV to KDI JSON
Due to customer requests, the CSV to KDI JSON code has been rewritten in Python. It uses the same meta map file to map CSV columns to KDI JSON. The source, csv_to_kdi.py, is located in the All_Samples repository in the KDI Importer directory alongside csv_KDI_json.rb. Even though the two scripts accomplish the same thing, there are some differences:
- csv_to_kdi.py has a help option: python csv_to_kdi.py --help.
- The has_header option was removed in csv_to_kdi.py because we couldn’t find anyone setting it to false.
- All but one parameter is optional in csv_to_kdi.py. The only positional parameter is the input file name. The help output lists the other parameters with their defaults.
Both scripts have the following options:
- assets_only – If set, only asset data is mapped. If not set, assets and vulnerabilities are mapped.
- domain_suffix – If set, the suffix is added to the hostname from the CSV file. If not set, just the hostname is used.
- skip_autoclose – If set, vulnerabilities are not automatically closed.
The README.md has been updated to reflect the new Python script.
Testing
The tests directory contains tests that compare the output of csv_to_kdi.py and csv_KDI_json.rb. So far, there are six different tests. An execute_test script was created to run one test and diff the results.
A Python script, diff_json.py, was created to diff the KDI JSON output of the Ruby and Python scripts. Unfortunately, the script only reports that there is a difference; you will have to use your favorite diff tool to discover what the differences are.
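If you want more than a yes/no answer, a small recursive walk can report where two KDI JSON documents differ. This is not the repo’s diff_json.py, just a sketch of one way to locate the differing paths (with made-up sample data):

```python
import json

def json_diff_paths(a, b, path="$"):
    """Recursively collect the JSON paths where a and b differ."""
    diffs = []
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            if key not in a or key not in b:
                diffs.append(f"{path}.{key} (missing on one side)")
            else:
                diffs.extend(json_diff_paths(a[key], b[key], f"{path}.{key}"))
    elif isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            diffs.append(f"{path} (list lengths {len(a)} vs {len(b)})")
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(json_diff_paths(x, y, f"{path}[{i}]"))
    elif a != b:
        diffs.append(f"{path} ({a!r} != {b!r})")
    return diffs

# Made-up sample outputs standing in for the Ruby and Python results.
ruby_out = json.loads('{"assets": [{"hostname": "web01", "tags": ["prod"]}]}')
python_out = json.loads('{"assets": [{"hostname": "web01", "tags": ["dev"]}]}')
print(json_diff_paths(ruby_out, python_out))
# ["$.assets[0].tags[0] ('prod' != 'dev')"]
```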
Conclusion
So now you should have all the tools to convert CSV to KDI JSON and upload the JSON file to a connector. Possible additions would be streaming or multi-part uploads for large data sets. Maybe another blog sometime.
Rick Ehrhart
API Evangelist
dev@kennasecurity.com