Using Twitter to forecast cryptocurrency returns #2 – Mining Cryptocurrency with CoinGecko API


Hui Fang Yeo
November 16, 2020

Gearing up for returns forecasting with VAR

Caption: Screen grab from CoinGecko

After the tweet mining explained in my first article, gathering cryptocurrency returns wasn’t difficult at all. I decided to use pycoingecko, a Python wrapper around the CoinGecko API.

The two key things to note are:

  • You need the id of the cryptocurrency in order to download its market data
  • The granularity of the data is determined automatically by the number of days you are downloading for

Getting id of cryptocurrency

Go to the CoinGecko API documentation and execute the “/coins/list” endpoint.

Get the ids of the coins required. In my case, I have the list below:

gecko_list = [
    "bitcoin",
    "ethereum",
    "ripple",  # xrp
    "tether",
    "bitcoin-cash",
    "cardano",
    "bitcoin-cash-sv",
    "litecoin",
    "chainlink",
    "binancecoin",
    "eos",
    "tron",
]
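The ids can also be looked up in code rather than by eye. The sketch below is a minimal illustration, assuming the coin list is a list of dictionaries with "id", "symbol" and "name" keys; `find_coin_ids` and `sample_coins` are hypothetical names, and the sample entries are made up for the example. Ticker symbols may be shared by several coins, so the helper returns every match instead of a single id.

```python
from typing import Dict, List


def find_coin_ids(coins: List[Dict[str, str]], symbol: str) -> List[str]:
    """Return every id whose ticker symbol matches `symbol` (case-insensitive)."""
    wanted = symbol.lower()
    return [c["id"] for c in coins if c.get("symbol", "").lower() == wanted]


# Hypothetical sample in the assumed shape of the coin list:
# a list of {"id", "symbol", "name"} dictionaries.
sample_coins = [
    {"id": "bitcoin", "symbol": "btc", "name": "Bitcoin"},
    {"id": "ripple", "symbol": "xrp", "name": "XRP"},
    {"id": "bitcoin-cash-sv", "symbol": "bsv", "name": "Bitcoin SV"},
]

print(find_coin_ids(sample_coins, "XRP"))

# With network access, the live list could be fetched with pycoingecko instead:
# from pycoingecko import CoinGeckoAPI
# coins = CoinGeckoAPI().get_coins_list()
```

If a symbol returns more than one id, check the coin names before adding an id to the list.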

Downloading historical market data

I used get_coin_market_chart_by_id, the wrapper around /coins/{id}/market_chart to get my historical market data. The granularity of the data returned depends on the number of days we are getting:

  • minutely data for a duration within 1 day
  • hourly data for a duration between 1 day and 90 days
  • daily data for a duration above 90 days
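To avoid being surprised by the granularity, the rule above can be written down as a small helper. This is only a sketch of the three cases listed; `expected_granularity` is a hypothetical name, and treating exactly 1 day as minutely and exactly 90 days as hourly is my assumption, since the list does not say which side the boundaries fall on.

```python
def expected_granularity(days: float) -> str:
    """Map a requested duration in days to the granularity described above."""
    if days <= 0:
        raise ValueError("days must be positive")
    if days <= 1:  # assumption: exactly 1 day still counts as "within 1 day"
        return "minutely"
    if days <= 90:  # assumption: exactly 90 days still returns hourly data
        return "hourly"
    return "daily"


for d in (1, 30, 300):
    print(d, expected_granularity(d))
```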

Since hourly data is not available beyond 90 days, I have to stick to daily market data. The code snippet below returns 300 days’ worth of historical market data against USD. I only store the prices returned for each timestamp.

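import pandas as pd
from pycoingecko import CoinGeckoAPI

cg = CoinGeckoAPI()
timePeriod = 300  # days; above 90 days the API returns daily data

data = {}
for coin in gecko_list:
    try:
        # "prices" is a list of [timestamp in ms, price] pairs
        nested_lists = cg.get_coin_market_chart_by_id(
            id=coin, vs_currency="usd", days=timePeriod
        )["prices"]
        data[coin] = {}
        data[coin]["timestamps"], data[coin]["values"] = zip(*nested_lists)

    except Exception as e:
        # Report the failing coin and carry on with the rest
        print(e)
        print("coin: " + coin)

# One single-column frame per coin, indexed by timestamp
frame_list = [
    pd.DataFrame(data[coin]["values"], index=data[coin]["timestamps"], columns=[coin])
    for coin in gecko_list
    if coin in data
]

# Assumed step (not shown in the original snippet): combine the frames
# into the df_cyptocurrency frame used in the next steps
df_cyptocurrency = pd.concat(frame_list, axis=1)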
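For readers who want to try the reshaping without calling the API, here is a minimal offline sketch. The `fake_prices` payloads are invented numbers in the shape the snippet works with (a list of [timestamp in milliseconds, price] pairs per coin), and lining the per-coin frames up with `pd.concat` is one way to get a single table with one column per coin.

```python
import pandas as pd

# Hypothetical payloads: a list of [timestamp_in_ms, price] pairs per coin.
fake_prices = {
    "bitcoin": [[1605398400000, 16000.0], [1605484800000, 16300.0]],
    "ethereum": [[1605398400000, 460.0], [1605484800000, 450.0]],
}

frames = []
for coin, pairs in fake_prices.items():
    # zip(*pairs) splits the pairs into two parallel tuples
    timestamps, values = zip(*pairs)
    frames.append(pd.DataFrame(list(values), index=list(timestamps), columns=[coin]))

# Align the per-coin frames on their timestamp index, one column per coin.
wide = pd.concat(frames, axis=1)
print(wide)
```

Timestamps that appear for one coin but not another would show up as missing values in the combined table, so it is worth checking for gaps before any analysis.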

Let’s convert the timestamp into a user-friendly format:

df_cyptocurrency["datetime"] = pd.to_datetime(df_cyptocurrency.index, unit="ms")
df_cyptocurrency["date"] = df_cyptocurrency["datetime"].dt.date
df_cyptocurrency["hour"] = df_cyptocurrency["datetime"].dt.hour

Let’s align the data into a format that is more generic:

df_cyptocurrency = df_cyptocurrency.melt(
    id_vars=["datetime", "date", "hour"], var_name="currency_name", ignore_index=True
)
df_cyptocurrency.head(5)
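At this point the table holds prices, not returns. One common way to derive returns from a long-format table like this is a simple percentage change per currency, sketched below on made-up numbers. This is an assumption on my part for illustration: `long_prices` is a hypothetical frame that mirrors the melted columns ("value" is the default column name that melt produces), and log returns would be an equally reasonable choice.

```python
import pandas as pd

# Hypothetical long-format prices, mirroring the melted columns above.
long_prices = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2020-11-14", "2020-11-15", "2020-11-14", "2020-11-15"]
        ),
        "currency_name": ["bitcoin", "bitcoin", "ethereum", "ethereum"],
        "value": [16000.0, 16400.0, 460.0, 437.0],
    }
)

# Sort so that each currency's prices are in time order, then take the
# percentage change within each currency (the first row per currency is NaN).
long_prices = long_prices.sort_values(["currency_name", "date"])
long_prices["return"] = long_prices.groupby("currency_name")["value"].pct_change()
print(long_prices)
```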

Now, that was not too bad, right? Gathering cryptocurrency returns was much simpler than mining tweets. The only confusing part of the data collection was the granularity of the data, because I had overlooked the API method definitions. My notebook is available in the Atoti notebook gallery for your reference.

Now, let’s take a breather before I start performing my time-series analysis.
