Exploring Blockchain data with Covalent API

Onchain-data in Python

Nov 24, 2023

Last Update: 21/11/2023
Blockchain: Binance Smart Chain
API use: Covalent

Introduction

We use the Covalent API, which returns a JSON response that we then transform into a dataframe. We explore the data by inspecting specific columns and building several data frames. The goal is both to understand how to turn a JSON response into a dataframe and to get familiar with the data returned by the 'get block' endpoint.

Setting Up

Installations

!python -m pip install requests
!pip install --upgrade pip
!pip3 install covalent-api-sdk
!pip install python-dotenv

Import Libraries

import requests
from requests.auth import HTTPBasicAuth
import pandas as pd
import json
from datetime import datetime
import os
import dotenv
from dotenv import load_dotenv

Environment setup

By placing your API key in a .env file and loading it into your Python script, you can keep sensitive information separate from your code, which is a good practice for security and configuration management.

# If needed, change the current working directory to the folder where the .env file lives.
os.chdir("/Users/Oscar/Documents/Python/Covalent")

# load environment variables from .env
dotenv.load_dotenv(".env")

True

# Get the API key from the environment variable
api_key = os.getenv("API_KEY")

# Check if the API key is available
if not api_key:
    raise ValueError("API_KEY not found in environment variables")

Block Height

The number of blocks in a blockchain, often referred to as the blockchain's "block height," indicates how many blocks have been added to the blockchain since its inception or the last reset. Each block in a blockchain typically contains a batch of transactions and references the previous block, forming a chronological and immutable chain of blocks. Some considerations:

The number of blocks shows the progress of the blockchain, indicating how far it has advanced from its genesis block. Each new block adds to the chain, demonstrating the ongoing operation of the blockchain network.
A higher number of blocks generally indicates a more secure and tamper-resistant blockchain. As more blocks are added, it becomes increasingly difficult to alter or manipulate transactions in earlier blocks, as doing so would require changing all subsequent blocks.
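
The tamper-resistance argument above can be illustrated with a toy hash-linked chain. This is only a sketch under simplifying assumptions: the block layout, the helper names (`block_hash`, `build_chain`, `first_invalid_height`) and the use of SHA-256 over JSON are hypothetical choices for illustration, not the actual block format or consensus rules of Binance Smart Chain.

```python
import hashlib
import json

GENESIS_PREVIOUS_HASH = "0" * 64  # placeholder link for the very first block


def block_hash(height, transactions, previous_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"height": height, "transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Build a list of blocks, each one referencing the hash of its predecessor."""
    chain = []
    previous_hash = GENESIS_PREVIOUS_HASH
    for height, transactions in enumerate(batches):
        current_hash = block_hash(height, transactions, previous_hash)
        chain.append(
            {
                "height": height,
                "transactions": transactions,
                "previous_hash": previous_hash,
                "hash": current_hash,
            }
        )
        previous_hash = current_hash
    return chain


def first_invalid_height(chain):
    """Return the height of the first inconsistent block, or None if the chain is valid."""
    previous_hash = GENESIS_PREVIOUS_HASH
    for block in chain:
        expected = block_hash(block["height"], block["transactions"], block["previous_hash"])
        if block["previous_hash"] != previous_hash or block["hash"] != expected:
            return block["height"]
        previous_hash = block["hash"]
    return None


chain = build_chain([["a pays b"], ["b pays c"], ["c pays d"]])
print(first_invalid_height(chain))  # None: the untouched chain is consistent

# Tampering with an early block breaks its stored hash ...
chain[1]["transactions"] = ["b pays mallory"]
print(first_invalid_height(chain))  # 1
```

To hide the change, an attacker would have to recompute the hash of block 1 and then of every block after it, because each later block stores the hash of its predecessor. That is why a longer chain makes older blocks harder to rewrite.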

This time we use the 'get block' endpoint to retrieve some basic data about a block.

Using the Covalent API to make our request

Url: This sets up the URL for the Covalent API endpoint for block information, here on the Binance Smart Chain mainnet. The URL format is https://api.covalenthq.com/v1/{chainName}/block_v2/{blockHeight}/
Headers: The accept header specifies that the client (your code) can handle a response in JSON format.
Authentication: The HTTPBasicAuth class from the requests library sets up basic authentication. It takes the API key as the username, and an empty string is provided as the password.
Make sure the value you pass is your actual Covalent API key (for example, the api_key variable loaded above), not the literal string 'api_key'.
Request: An HTTP GET request is sent to the specified Covalent API endpoint (url) with the provided headers and authentication.
Response: The response from the API is stored in the response variable.

# {chainName} in this case is bsc-mainnet and {blockHeight} is 1, because we want the block at height 1 (the genesis block is at height 0). You can use any other block number.
url = "https://api.covalenthq.com/v1/bsc-mainnet/block_v2/1/"
headers = {
    "accept": "application/json",
}
# Use the api_key variable loaded from .env as the username; the password stays empty.
basic = HTTPBasicAuth(api_key, '')
response = requests.get(url, headers=headers, auth=basic)
print(response.text)

{"data":{"updated_at":"2023-11-20T18:03:16.980284701Z","chain_id":56,"chain_name":"bsc-mainnet","items":[{"signed_at":"2020-08-29T03:24:09Z","height":1}],"pagination":null},"error":false,"error_message":null,"error_code":null}
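
To query other chains or block heights without editing the URL by hand, the URL format described above can be wrapped in a small helper. This is a minimal sketch: `build_block_url` and `BASE_URL` are hypothetical names introduced here, and the helper only assembles the string; it does not check that the chain name is one Covalent actually supports.

```python
from urllib.parse import quote

BASE_URL = "https://api.covalenthq.com/v1"  # base of the URL format shown above


def build_block_url(chain_name: str, block_height: int) -> str:
    """Fill {chainName} and {blockHeight} into the 'get block' URL format."""
    if block_height < 0:
        raise ValueError("block_height must be non-negative")
    # quote() escapes any character that is not safe inside a URL path segment
    return f"{BASE_URL}/{quote(chain_name, safe='')}/block_v2/{block_height}/"


print(build_block_url("bsc-mainnet", 1))
# https://api.covalenthq.com/v1/bsc-mainnet/block_v2/1/
```

The returned string can be passed to requests.get in place of the hard-coded url, together with the same headers and authentication.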

Description of the response

The JSON data is the API response describing the block at height 1 of the Binance Smart Chain mainnet (chain_id: 56). Let's break down the key components of the JSON structure:

Data Section

updated_at: Indicates the timestamp when the data was last updated. In this case, it is "2023-11-20T18:03:16.980284701Z".
chain_id: Represents the identifier for the blockchain, and in this instance, it is 56.
chain_name: Specifies the name of the blockchain, which is "bsc-mainnet" for the Binance Smart Chain main network.
items: Contains an array with one entry per block. Each entry has signed_at, the timestamp when the block was signed (here "2020-08-29T03:24:09Z"), and height, the height of the block (here 1).

Pagination Section

pagination: Appears to be null, indicating that there is no pagination information provided in this response.

Error Handling Section

error: Indicates whether an error occurred. In this case, it is set to false, suggesting that the request was successful.
error_message: If an error occurred, this field would contain a message describing the error. In this example, it is null.
error_code: Similarly, if an error occurred, this field would contain a specific error code. In this case, it is also null.
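
The three error fields described above can be checked before working with the data. The following is a minimal sketch, not part of any Covalent SDK: `CovalentAPIError` and `unwrap_data` are hypothetical names, and the failing payload (its code and message) is invented purely to exercise the error path.

```python
class CovalentAPIError(Exception):
    """Hypothetical exception carrying the error_code and error_message fields."""

    def __init__(self, code, message):
        super().__init__(f"Covalent API error {code}: {message}")
        self.code = code
        self.message = message


def unwrap_data(payload: dict) -> dict:
    """Return payload['data'] if the request succeeded, otherwise raise."""
    if payload.get("error"):
        raise CovalentAPIError(payload.get("error_code"), payload.get("error_message"))
    return payload["data"]


# Successful payload, same shape as the response shown earlier
ok_payload = {
    "data": {
        "chain_id": 56,
        "chain_name": "bsc-mainnet",
        "items": [{"signed_at": "2020-08-29T03:24:09Z", "height": 1}],
        "pagination": None,
    },
    "error": False,
    "error_message": None,
    "error_code": None,
}
print(unwrap_data(ok_payload)["chain_name"])  # bsc-mainnet

# Invented failing payload, for illustration only
bad_payload = {"data": None, "error": True, "error_message": "Invalid API key", "error_code": 401}
try:
    unwrap_data(bad_payload)
except CovalentAPIError as exc:
    print(exc)  # Covalent API error 401: Invalid API key
```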

# Parse JSON data
parse_data = json.loads(response.text)
#Checking type of parse_data
print(type(parse_data)) 
print(parse_data)

<class 'dict'>
{'data': {'updated_at': '2023-11-20T18:03:16.980284701Z', 'chain_id': 56, 'chain_name': 'bsc-mainnet', 'items': [{'signed_at': '2020-08-29T03:24:09Z', 'height': 1}], 'pagination': None}, 'error': False, 'error_message': None, 'error_code': None}

Let's check parse_data

df = pd.DataFrame(parse_data)
df.head()

                                                         data  error  error_message  error_code
chain_id                                                   56  False           None        None
chain_name                                        bsc-mainnet  False           None        None
items       [{'signed_at': '2020-08-29T03:24:09Z', 'height...  False           None        None
pagination                                               None  False           None        None
updated_at                     2023-11-20T18:03:16.980284701Z  False           None        None

# We can Flatten the entire JSON structure using pd.json_normalize
flattened_data = pd.json_normalize(parse_data,sep='_')

flattened_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 8 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   error            1 non-null      bool  
 1   error_message    0 non-null      object
 2   error_code       0 non-null      object
 3   data_updated_at  1 non-null      object
 4   data_chain_id    1 non-null      int64 
 5   data_chain_name  1 non-null      object
 6   data_items       1 non-null      object
 7   data_pagination  0 non-null      object
dtypes: bool(1), int64(1), object(6)
memory usage: 185.0+ bytes

flattened_data.head()

   error  error_message  error_code                 data_updated_at  data_chain_id  data_chain_name                                         data_items  data_pagination
0  False           None        None  2023-11-20T18:03:16.980284701Z             56      bsc-mainnet  [{'signed_at': '2020-08-29T03:24:09Z', 'height...             None
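
To see why the nested keys become columns such as data_chain_id while data_items stays a list, it can help to write the flattening by hand. The sketch below is a simplified stand-in for the idea, not how pandas implements pd.json_normalize; `flatten` is a hypothetical helper, and it makes no attempt to reproduce the column order pandas uses.

```python
def flatten(nested: dict, sep: str = "_", prefix: str = "") -> dict:
    """Recursively join nested dict keys with `sep`; lists are left untouched."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested dictionaries, carrying the parent key as a prefix
            flat.update(flatten(value, sep=sep, prefix=name))
        else:
            # Scalars, None and lists (such as 'items') are stored as-is
            flat[name] = value
    return flat


sample = {
    "data": {
        "updated_at": "2023-11-20T18:03:16.980284701Z",
        "chain_id": 56,
        "chain_name": "bsc-mainnet",
        "items": [{"signed_at": "2020-08-29T03:24:09Z", "height": 1}],
        "pagination": None,
    },
    "error": False,
    "error_message": None,
    "error_code": None,
}

flat = flatten(sample)
print(flat["data_chain_id"])     # 56
print(type(flat["data_items"]))  # <class 'list'>
```

Because the list under items is not a dictionary, it is not expanded, which is why it needs a separate step to become its own data frame.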

Let's check the 'data' data frame

data = parse_data['data']
df_data = pd.json_normalize(data)
df_data.head()

                       updated_at  chain_id   chain_name                                              items  pagination
0  2023-11-20T18:03:16.980284701Z        56  bsc-mainnet  [{'signed_at': '2020-08-29T03:24:09Z', 'height...        None

Let's print a specific column in 'data'

updated_at = parse_data['data']['updated_at'] 
updated_at

'2023-11-20T18:03:16.980284701Z'
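
Note that this updated_at string has nine fractional digits (nanoseconds), while the %f directive of datetime.strptime accepts at most six. One way to turn it into a datetime with only the standard library is to truncate the fraction to microseconds first. This is a sketch: `parse_utc_timestamp` is a hypothetical helper, and it assumes the timestamp always ends in "Z" (UTC), as in the response above.

```python
from datetime import datetime, timezone


def parse_utc_timestamp(ts: str) -> datetime:
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fraction]Z', truncating the fraction to microseconds."""
    if not ts.endswith("Z"):
        raise ValueError(f"expected a UTC timestamp ending in 'Z', got {ts!r}")
    body = ts[:-1]
    if "." in body:
        main, fraction = body.split(".", 1)
        fraction = fraction[:6].ljust(6, "0")  # keep at most six digits
    else:
        main, fraction = body, "000000"
    parsed = datetime.strptime(f"{main}.{fraction}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


print(parse_utc_timestamp("2023-11-20T18:03:16.980284701Z"))
# 2023-11-20 18:03:16.980284+00:00
print(parse_utc_timestamp("2020-08-29T03:24:09Z"))
# 2020-08-29 03:24:09+00:00
```

The last three digits of the fraction are dropped, which is usually acceptable for a "last updated" field.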

Let's look inside 'items' and at its columns, signed_at and height

items = parse_data['data']['items'][0]
items

{'signed_at': '2020-08-29T03:24:09Z', 'height': 1}

#Signed_at column inside Items
print(parse_data['data']['items'][0]['signed_at'])

2020-08-29T03:24:09Z

#height column inside Items
print(parse_data['data']['items'][0]['height'])

1

# Flatten the nested items JSON structure
items = parse_data['data']['items'][0]
df_items = pd.json_normalize(items)
df_items.head()

              signed_at  height
0  2020-08-29T03:24:09Z       1

Finally, I wanted to make a data frame with the column "chain_name" and the columns inside "items"

# Extract relevant data
chain_name = parse_data['data']['chain_name']
signed_at = parse_data['data']['items'][0]['signed_at']
height = parse_data['data']['items'][0]['height']

# Convert the string timestamp to a datetime object
# signed_at = datetime.strptime(signed_at, '%Y-%m-%dT%H:%M:%SZ')

# Convert the string timestamp to a datetime object and extract the date
signed_at = datetime.strptime(signed_at, '%Y-%m-%dT%H:%M:%SZ').date()

# Create a DataFrame
df = pd.DataFrame({'chain_name': [chain_name], 'signed_at': [signed_at], 'height': [height]})

df.head()

    chain_name   signed_at  height
0  bsc-mainnet  2020-08-29       1

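There is another way, using pd.json_normalize with record_path and meta. Note that here signed_at stays a string instead of being converted to a date: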

df = pd.json_normalize(data, record_path=['items'], meta=['chain_name']) 
print(df)

              signed_at  height   chain_name
0  2020-08-29T03:24:09Z       1  bsc-mainnet
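
The record_path approach becomes more useful when 'items' holds several entries, since each entry turns into its own row and the meta fields are repeated alongside. The sketch below assumes a payload of the same shape as 'data'; the second item is invented for illustration only and is not real chain data. It also converts signed_at with pd.to_datetime, one possible way to get proper timestamps in this variant.

```python
import pandas as pd

# Same shape as parse_data['data']; the second item is hypothetical
sample_data = {
    "chain_id": 56,
    "chain_name": "bsc-mainnet",
    "items": [
        {"signed_at": "2020-08-29T03:24:09Z", "height": 1},
        {"signed_at": "2020-08-29T03:24:12Z", "height": 2},  # invented values
    ],
}

# One row per entry in 'items'; chain_name and chain_id are repeated on every row
df_blocks = pd.json_normalize(
    sample_data,
    record_path=["items"],
    meta=["chain_name", "chain_id"],
)

# Convert the ISO 8601 strings into timezone-aware timestamps
df_blocks["signed_at"] = pd.to_datetime(df_blocks["signed_at"])

print(df_blocks)
```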

Contact:

[Linkedin](https://www.linkedin.com/in/oscarquirogap/)
[WebPage](http://onanalytics.co/contact)
[GitHub](https://github.com/On-Analytics/Access_To_Blockchain_Data.git)

OnAnalytics
