Upload bulk JSON data to ElasticSearch using Python

- Pranay on Jun 14 '18

This post describes how to perform bulk actions against ElasticSearch using the bulk helpers of the Python ElasticSearch Client.

Setting up ElasticSearch and Python

It is assumed that you have already set up ElasticSearch and have a Python environment ready, along with an IDE. If not, first go through this post on ElasticSearch and Python environment setup: Setting up and getting started with ElasticSearch using Kibana & Python

Python ElasticSearch Client

This requires the Python Elasticsearch Client. Install it as described here - Python Elasticsearch Client Installation - or just run the command below from your terminal or command prompt.

pip install elasticsearch

Bulk upload to ElasticSearch using Python code

The upload takes three steps:

  1. Open the .json file as a Python file object
  2. Parse the file's contents into Python objects with the json module
  3. Upload the parsed documents using the bulk helper function. Here is detailed documentation on the syntax of the bulk helper function
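
The bulk helper in step 3 consumes an iterable of action dictionaries. As a minimal sketch, assuming the 6.x-era client this post was written against (the `_type` key and the `port` argument were dropped in later major versions), the three steps could look roughly like this. `build_actions` and `bulk_upload` are hypothetical names chosen for illustration, and the index and type values are the ones used in this post's script.

```python
import json


def build_actions(docs, index_name="shakespeare", doc_type="Blog"):
    """Wrap each document in the action dictionary the bulk helper consumes."""
    for doc_id, doc in enumerate(docs):
        yield {
            "_index": index_name,
            "_type": doc_type,  # assumption: 6.x-era cluster that still uses types
            "_id": doc_id,
            "_source": doc,
        }


def bulk_upload(path):
    """Run the three steps: open the file, parse it, send it in bulk."""
    # Imported inside the function so build_actions stays usable on its own.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(["localhost"], port=9200)
    with open(path, encoding="utf-8") as json_file:  # step 1: file object
        docs = json.load(json_file)  # step 2: assumes a JSON array of objects
    # Step 3: helpers.bulk returns (number of successes, list of errors).
    return helpers.bulk(es, build_actions(docs))
```

Calling `bulk_upload` needs a reachable cluster on localhost:9200; `build_actions` can be inspected on its own to see what would be sent.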

Below is a Python script that reads the .json file and indexes its documents into ElasticSearch one at a time

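from elasticsearch import Elasticsearch

# The host/port arguments and doc_type below match the client versions
# current when this post was written; newer clients no longer accept them.
es = Elasticsearch(
    ['localhost'],
    port=9200
)

# Raw string so the backslashes in the Windows path are not read as escapes.
with open(r"C:\ElasticSearch\shakespeare_6.0.json", 'r') as json_file:
    lines = json_file.read().splitlines(True)

i = 0
json_str = ""
docs = {}
for line in lines:
    # Trim only the surrounding whitespace; removing every space would also
    # strip the spaces inside the text values.
    line = line.strip()
    if line != "},":
        json_str = json_str + line
    else:
        # A line holding only "}," closes the current document.
        docs[i] = json_str + "}"
        json_str = ""
        print(docs[i])
        es.index(index='shakespeare', doc_type='Blog', id=i, body=docs[i])
        i = i + 1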

Screenshot: Output of the command running in Python
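
The line-reassembly logic in the script above can be pulled into a small generator, which makes it testable without a running cluster. This is a sketch rather than the post's original code: `iter_documents` is a hypothetical name, and it assumes a pretty-printed file of flat objects in which each object closes on a line containing only `},` (or `}` for the last one).

```python
import json


def iter_documents(lines):
    """Yield one parsed dict per object in a pretty-printed JSON file.

    Assumes flat objects (no nested braces on their own line) that each
    end on a line containing only "}," or "}".
    """
    buffer = []
    for raw in lines:
        line = raw.strip()
        if line in ("", "[", "]"):
            continue  # skip blank lines and the enclosing array brackets
        if line in ("},", "}"):
            buffer.append("}")
            yield json.loads("".join(buffer))
            buffer = []
        else:
            buffer.append(line)


# Illustrative sample, not taken from the real data file.
sample = [
    '{\n',
    '  "speaker": "KING HENRY IV",\n',
    '  "text_entry": "So shaken as we are"\n',
    '},\n',
    '{\n',
    '  "speaker": "WESTMORELAND",\n',
    '  "text_entry": "My liege"\n',
    '}\n',
]
parsed = list(iter_documents(sample))
```

Because only the surrounding whitespace of each line is trimmed, the spaces inside values such as `text_entry` survive parsing.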

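The same file can be sent in a single bulk request using the Python code below.

import json
from elasticsearch import Elasticsearch, helpers

ES_INDEX = 'shakespeare'
ES_TYPE = 'Blog'

es = Elasticsearch(
    ['localhost'],
    port=9200
)
# json.load expects the file to contain one JSON array of documents.
with open(r"C:\ElasticSearch\shakespeare_6.0.json") as json_file:
    json_docs = json.load(json_file)
# The index and type passed here apply to every document in the list.
helpers.bulk(es, json_docs, index=ES_INDEX, doc_type=ES_TYPE)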

Screenshot: Output of the command running in Python
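
`json.load` only works when the file holds a single JSON value. If the file is instead newline-delimited, with an action line such as `{"index": {...}}` before each document (the layout the ElasticSearch bulk API itself consumes), it has to be read line by line. The sketch below assumes that layout; `iter_ndjson_sources` is a hypothetical helper, not part of the client library.

```python
import json

# Operation names that can appear as the single key of a bulk action line.
BULK_OPS = {"index", "create", "update", "delete"}


def iter_ndjson_sources(lines):
    """Yield the document lines of a bulk-format file, skipping action lines."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines
        record = json.loads(raw)
        # An action line is a single-key object such as {"index": {...}}.
        # Caveat: a real document with exactly one such key would be skipped too.
        if len(record) == 1 and next(iter(record)) in BULK_OPS:
            continue
        yield record


# Illustrative sample, not taken from the real data file.
sample_lines = [
    '{"index": {"_index": "shakespeare", "_id": 0}}\n',
    '{"speaker": "HAMLET", "line_id": 1}\n',
    '\n',
    '{"index": {"_index": "shakespeare", "_id": 1}}\n',
    '{"speaker": "OPHELIA", "line_id": 2}\n',
]
sources = list(iter_ndjson_sources(sample_lines))
```

The resulting list of dictionaries can be handed to the bulk helper in place of the output of `json.load`.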

The upload can also be verified from the Kibana Dev Tools console (if Kibana is already installed)

Kibana GET command

Screenshot: With Kibana GET command and output in the right side
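
The same check can be scripted from Python. As a sketch, assuming the client's `count` API and the `shakespeare` index used in this post (`count_documents` is a hypothetical helper), the number of indexed documents could be read back like this:

```python
def count_documents(es, index_name="shakespeare"):
    """Return how many documents the given index currently holds."""
    # es.count returns a response whose "count" field holds the total.
    return es.count(index=index_name)["count"]
```

Passing in the `es` client created earlier should return the number of documents uploaded. Newly indexed documents can take a refresh interval (about one second by default) to become visible to this call.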

I hope this post has helped you. Please comment and let me know your thoughts!

About the author

Pranay Deegoju
Sitecore Certified Professional

A Software Engineer by profession, a part time blogger and an enthusiast programmer. You can find more about me here.


