How can I create xticks with varying intervals?


I am drawing a line chart using matplotlib as shown below.

import matplotlib.pyplot as plt
import pandas as pd
import io

temp = u"""
tenor,yield
1M,5.381
3M,5.451
6M,5.505
1Y,5.393
5Y,4.255
10Y,4.109
"""

data = pd.read_csv(io.StringIO(temp), sep=",")
plt.plot(data['tenor'], data['yield'])

Output: the ticks on the x-axis are evenly spaced, regardless of the tenor they represent.

[image]

What I want: ticks spaced at different intervals along the x-axis, as in the screenshot below.

Is there any way to set the tick intervals differently?

[image]

Best answer

In the 'tenor' column, 'M' stands for months and 'Y' for years. Create a 'Month' column that expresses every tenor in months, multiplying the 'Y' values by 12.
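As a rough illustration of that conversion, here is a minimal sketch in plain Python. The names `MONTHS_PER_UNIT` and `tenor_to_months` are made up for this example, and it assumes every tenor is an integer followed by a single 'M' or 'Y'.

```python
# Hypothetical lookup table: months per unit suffix (assumes only 'M' and 'Y' occur)
MONTHS_PER_UNIT = {"M": 1, "Y": 12}


def tenor_to_months(tenor):
    """Convert a tenor string such as '3M' or '10Y' to a number of months."""
    value, unit = int(tenor[:-1]), tenor[-1]
    return value * MONTHS_PER_UNIT[unit]


tenors = ["1M", "3M", "6M", "1Y", "5Y", "10Y"]
months = [tenor_to_months(t) for t in tenors]
print(months)  # [1, 3, 6, 12, 60, 120]
```

Unlike a plain `'Y' in x` check, an unknown suffix raises a `KeyError` here instead of being silently treated as months.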

It's more concise to plot the data directly with pandas.DataFrame.plot, and then use .set_xticks to set both the tick positions and the tick labels.

Tested in Python 3.12.0, pandas 2.1.1, matplotlib 3.8.0

# Reuses temp, pd and io from the question's code
data = pd.read_csv(io.StringIO(temp), sep=",")

# Add a "Month" column derived from the "tenor" column (years are converted to months)
data["Month"] = data['tenor'].apply(lambda x: int(x[:-1]) * 12 if 'Y' in x else int(x[:-1]))

# Plot yield vs. Month
ax = data.plot(x='Month', y='yield', figsize=(17, 5), legend=False)

# Place the ticks at the Month values and label them with the tenor strings
_ = ax.set_xticks(data.Month, data.tenor)

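The output: the ticks now sit at 1, 3, 6, 12, 60 and 120 months and carry the original tenor labels, so their spacing is no longer uniform.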
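The same idea also works with plain matplotlib, closer to the code in the question. The sketch below hard-codes the month values for brevity and sets positions and labels in two separate calls, which should also work on matplotlib versions older than 3.5, where, as far as I know, `set_xticks` does not yet accept labels. The `Agg` backend is only an assumption so that the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

tenors = ["1M", "3M", "6M", "1Y", "5Y", "10Y"]
yields = [5.381, 5.451, 5.505, 5.393, 4.255, 4.109]
months = [1, 3, 6, 12, 60, 120]  # tenors expressed in months

fig, ax = plt.subplots(figsize=(17, 5))
ax.plot(months, yields)

# Tick positions follow the month values, so the spacing varies
ax.set_xticks(months)
# Label each tick with the original tenor string
ax.set_xticklabels(tenors)
```

In an interactive session, `plt.show()` (or `fig.savefig(...)`) would display or save the figure.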


.apply is faster than the vectorized np.where variants in this test; the lambda and the named function are effectively tied

  • Given 6 million rows
import numpy as np  # needed for the np.where variants below
data = pd.DataFrame({'tenor': ['1M', '3M', '6M', '1Y', '5Y', '10Y'] * 1000000})

Compare Implementations with %timeit in JupyterLab

  • .apply & lambda
%timeit data['tenor'].apply(lambda x: int(x[:-1]) * 12 if 'Y' in x else int(x[:-1]))

2 s ± 50.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

  • .apply with a function call
def scale(x):
    v = int(x[:-1])
    return v * 12 if 'Y' in x else v

%timeit data['tenor'].apply(scale)

2.02 s ± 20.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

  • Vectorized np.where with assignment expression
%timeit np.where(data.tenor.str.contains('Y'), (v := data.tenor.str[:-1].astype(int)) * 12, v)

2.44 s ± 26.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

  • Vectorized np.where without assignment expression
%timeit np.where(data.tenor.str.contains('Y'), data.tenor.str[:-1].astype(int) * 12, data.tenor.str[:-1].astype(int))

3.36 s ± 5.38 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
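Before reading too much into the timings, it may be worth checking that the implementations actually return the same values. A small sketch on the six original tenors follows; it assumes Python 3.8+ for the assignment expression, and the `via_*` variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Small sample; the benchmark repeats this list 1,000,000 times
data = pd.DataFrame({"tenor": ["1M", "3M", "6M", "1Y", "5Y", "10Y"]})


def scale(x):
    v = int(x[:-1])
    return v * 12 if "Y" in x else v


# .apply with a lambda
via_lambda = data["tenor"].apply(lambda x: int(x[:-1]) * 12 if "Y" in x else int(x[:-1]))
# .apply with a named function
via_function = data["tenor"].apply(scale)
# np.where; v is assigned in the second argument and reused in the third
via_where = np.where(
    data["tenor"].str.contains("Y"),
    (v := data["tenor"].str[:-1].astype(int)) * 12,
    v,
)

# All three should agree element by element
print(via_lambda.tolist() == via_function.tolist() == via_where.tolist())
```

Note that `np.where` returns a NumPy array while `.apply` returns a Series, which matters if the result is used for anything other than assigning a new column.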