Upload bulk JSON data to ElasticSearch using Python

Pranay
May 16, 2020  ·  35286 views

This post describes how to perform bulk actions on ElasticSearch using the bulk helpers of the Python ElasticSearch Client.

Setting up ElasticSearch and Python

It is assumed that you have already set up ElasticSearch and have a Python environment ready, along with an IDE. If not, go through this post first for the ElasticSearch and Python environment setup: Setting up and getting started with ElasticSearch using Kibana & Python


Python ElasticSearch Client

This requires installing the Python Elasticsearch Client, as described in Python Elasticsearch Client Installation, or just run the command below from your command line.

pip install elasticsearch

Uploading bulk data from JSON file to ElasticSearch using Python code

Below are the steps I followed to achieve this:

  1. Load the .json file into a Python file object
  2. Load the data from the file as a Python object using the json module
  3. Upload this object using the bulk helper function. See the Python Elasticsearch Client documentation for the detailed syntax of the bulk helper function
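
The three steps above can be sketched roughly as follows. This is a minimal sketch, not the exact script used in this post: it assumes the file holds a JSON array of documents, and the function names (load_documents, build_actions, upload) and the host URL are made up for illustration.

```python
import json


def load_documents(path):
    """Steps 1 and 2: open the file and parse it into Python objects."""
    with open(path, encoding="utf-8") as json_file:
        return json.load(json_file)


def build_actions(documents, index):
    """Wrap each document in the action format the bulk helper accepts."""
    for doc_id, document in enumerate(documents):
        yield {"_index": index, "_id": doc_id, "_source": document}


def upload(path, index, host="http://localhost:9200"):
    """Step 3: send all actions with the bulk helper (host is an assumption)."""
    # Imported here so the two helpers above work without the client installed.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(host)
    # helpers.bulk returns a (success_count, errors) tuple.
    return helpers.bulk(es, build_actions(load_documents(path), index))
```

Calling upload(r"C:\ElasticSearch\shakespeare_6.0.json", "shakespeare") would then send the whole file in batches instead of one request per document. Building the actions with a generator keeps only one action in memory at a time on top of the parsed file.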

Below is the Python script that reads the .json file and uploads its documents to ElasticSearch. Note that it indexes one document per request with es.index.

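import json
from elasticsearch import Elasticsearch

# Written against the 7.x Python client. The 8.x client expects a full URL
# such as "http://localhost:9200" and no longer accepts doc_type.
es = Elasticsearch(
    ['localhost'],
    port=9200
)

# Raw string so the backslashes in the Windows path are kept literally.
with open(r"C:\ElasticSearch\shakespeare_6.0.json", 'r') as my_file:
    clear_data = my_file.read().splitlines(True)

i = 0
json_str = ""
docs = {}
for line in clear_data:
    # A line holding only "}," closes one document. Only the comparison uses
    # the stripped line, so whitespace inside string values is preserved.
    if line.strip() != "},":
        json_str = json_str + line
    else:
        docs[i] = json_str + "}"
        json_str = ""
        print(docs[i])
        # One request per document; the running counter is used as the id.
        es.index(index='shakespeare', doc_type='Blog', id=i, body=docs[i])
        i = i + 1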

Screenshot: Output of the command running in Python

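The whole file can also be uploaded in one call with the bulk helper, using the Python code below. This assumes the file holds a JSON array of documents that json.load can parse.

import json
from elasticsearch import Elasticsearch, helpers

# Connection and index settings; index and type match the script above.
ES_CLUSTER = ['http://localhost:9200']
ES_INDEX = 'shakespeare'
ES_TYPE = 'Blog'

es = Elasticsearch(ES_CLUSTER)
with open(r"C:\ElasticSearch\shakespeare_6.0.json") as json_file:
    json_docs = json.load(json_file)
# helpers.bulk sends the documents in batches; index and doc_type are passed
# through to the bulk API (doc_type only works with 7.x and earlier clients).
helpers.bulk(es, json_docs, index=ES_INDEX, doc_type=ES_TYPE)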

Screenshot: Output of the command running in Python

The uploaded data can be verified from the Kibana Dev console (if Kibana is already installed)

Kibana GET command

Screenshot: With Kibana GET command and output in the right side
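
A similar check can be done from Python. The sketch below is only an illustration under a few assumptions: summarize_index is a made-up helper name, and es is a client connected to the same cluster as in the scripts above. It uses the client's count and search calls to return the number of documents in the index and one sample document.

```python
def summarize_index(es, index):
    """Return the document count and one sample document from an index."""
    # count() responds with a dict that holds the total under "count".
    count = es.count(index=index)["count"]
    # Without a query, search() matches all documents; size=1 keeps one hit.
    hits = es.search(index=index, size=1)["hits"]["hits"]
    sample = hits[0]["_source"] if hits else None
    return count, sample
```

For example, summarize_index(es, "shakespeare") should return the total number of uploaded documents together with one of them. Documents indexed a moment ago may be missing from the count until the index has been refreshed.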

I hope this post has helped you. Please comment and let me know your thoughts!


