DATA ENGINEERING Pipeline
An automated Python pipeline utilizing the CoinMarketCap API to track, analyze, and visualize real-time trends of top cryptocurrencies.
This project builds a pipeline for financial data analysis. It connects to the CoinMarketCap API, pulls live market data, normalizes it into a Pandas DataFrame, and stores it locally for historical tracking. The final step plots percentage changes over several timeframes (1h, 24h, 7d, 30d).
Handles secure API requests with headers and parameters to fetch live crypto assets.
Appends new data to a local CSV file to build a historical dataset over time.
Calculates mean percentage changes across several time intervals (the code below uses 1h, 24h, and 7d).
Below is a dynamic representation of the market data generated by the project. It visualizes the volatility (Percentage Change) of top assets over time.
Using requests, we configure the request headers with our API key and define parameters to fetch the top 15 currencies, with prices converted to USD.
from requests import Session
url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'
# Ask for the top 15 listings, priced in USD
parameters = {
    'start': '1',
    'limit': '15',
    'convert': 'USD'
}
# 'Accept' is the standard HTTP header name for content negotiation
headers = {
    'Accept': 'application/json',
    'X-CMC_PRO_API_KEY': 'YOUR-API-KEY-HERE'
}
session = Session()
session.headers.update(headers)
response = session.get(url, params=parameters)
response.raise_for_status()  # fail early on HTTP errors
data = response.json()  # parse the JSON body into a dict
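A more defensive variant of the request step might read the key from the environment and set a timeout. The following is a minimal sketch, not the project's actual code: the helper names `build_session` and `fetch_listings` and the environment variable `CMC_API_KEY` are assumptions introduced here for illustration.

```python
import os

from requests import Session

URL = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'


def build_session(api_key=None):
    """Return a Session that sends the JSON Accept header and the API key."""
    if api_key is None:
        # CMC_API_KEY is an assumed variable name, not something the API defines
        api_key = os.environ['CMC_API_KEY']
    session = Session()
    session.headers.update({
        'Accept': 'application/json',
        'X-CMC_PRO_API_KEY': api_key,
    })
    return session


def fetch_listings(session, limit=15, convert='USD', timeout=10):
    """Fetch the top `limit` listings and return the parsed JSON body."""
    params = {'start': '1', 'limit': str(limit), 'convert': convert}
    response = session.get(URL, params=params, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()
```

With the key exported in the shell, `data = fetch_listings(build_session())` would yield the same `data` dictionary used in the next step. Keeping the key out of the source file also makes it harder to commit it by accident.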
The nested JSON response is normalized into a Pandas DataFrame. We then group by currency name to calculate average percentage changes.
import pandas as pd
# Normalize JSON to DataFrame
df = pd.json_normalize(data['data'])
# Calculate mean changes for visualization
df_viz = df.groupby('name', sort=False)[[
    'quote.USD.percent_change_1h',
    'quote.USD.percent_change_24h',
    'quote.USD.percent_change_7d'
]].mean()
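The feature list mentions appending each pull to a local CSV file, but that step is not shown. One way it could look is sketched below. The function name `append_snapshot`, the file name `crypto_history.csv`, and the added `timestamp` column are all assumptions made for this example.

```python
import os
from datetime import datetime, timezone

import pandas as pd


def append_snapshot(df, path='crypto_history.csv'):
    """Append one snapshot to `path`, writing the header only for a new file."""
    snapshot = df.copy()
    # Stamp every row so snapshots from different runs can be told apart
    snapshot['timestamp'] = datetime.now(timezone.utc).isoformat()
    write_header = not os.path.isfile(path)
    snapshot.to_csv(path, mode='a', header=write_header, index=False)
    return snapshot
```

Calling `append_snapshot(df)` after each API pull grows the file by one row per currency. Reading it back with `pd.read_csv` gives several rows per name, which is when the group-by mean becomes an average over time rather than a single value.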
Finally, the data is reshaped to allow Seaborn to plot the time intervals on the X-axis and percentage change on the Y-axis.
import seaborn as sns
import matplotlib.pyplot as plt
# Reshape data for plotting
df_melted = df_viz.stack().to_frame().reset_index()
df_melted = df_melted.rename(columns={0: 'values', 'level_1': 'interval'})
sns.pointplot(x='interval', y='values', hue='name', data=df_melted)
plt.show()
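The same long format can also be produced with `melt`, which names the new columns directly instead of renaming them afterwards. This is a sketch on a small made-up frame standing in for `df_viz`; the numbers are illustrative only and do not come from the API.

```python
import pandas as pd

# Made-up stand-in for df_viz: one row per currency, one column per interval
df_viz = pd.DataFrame(
    {
        'quote.USD.percent_change_1h': [0.1, -0.2],
        'quote.USD.percent_change_24h': [1.5, 2.0],
    },
    index=pd.Index(['Bitcoin', 'Ethereum'], name='name'),
)

# Move 'name' out of the index, then unpivot the interval columns
df_melted = df_viz.reset_index().melt(
    id_vars='name', var_name='interval', value_name='values'
)

# Shorten the labels so the X-axis reads '1h', '24h', ...
df_melted['interval'] = df_melted['interval'].str.replace(
    'quote.USD.percent_change_', '', regex=False
)
```

The resulting `df_melted` has the columns `name`, `interval`, and `values`, so it can be passed to `sns.pointplot` exactly as before.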
This project demonstrates a complete data engineering workflow. Key takeaways include:
Using json_normalize and a melt-style reshape (stack in the code above) to transform raw JSON into a format suitable for visualization.