I've been working with a lot of data, but for simplicity we can use the sample data below. I'm trying to plot a line chart that lets me see the extremes, compresses the Y axis where no values are present, and expands the Y axis where data is present. The problem is shown in the picture below: there is no data between 500 and 3,500, yet that range takes up a huge gap, and the employee lines collapse into an almost solid line at the bottom.
What I'd like is a line chart that still includes the extremes (the totals) as a line at the top, but without the huge gap, so that the individual sales data stays readable like this:
Here's the code I have for the charts so far, but I need to be able to apply it to a much larger data set, one that contains hundreds of "employees" across multiple stores. The extreme values would be something like "Store_A": 9560, "Store_B": 6470, while the 'normal' employee sales values would only range between 0 and 300 (the 300 is variable; some weeks we've got guys that do more than 300).
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Sample data
data = {
    'Date': ['2025-01-10', '2025-01-17', '2025-01-24', '2025-01-31'],
    'Bob': [156, 60, 58, 62],
    'Joe': [37, 40, 139, 42],
    'Sally': [62, 265, 63, 67],
    'Total': [3698, 3750, 3720, 3800]
}
# Create a DataFrame
df = pd.DataFrame(data)
# Melt the DataFrame to long format
df_melted = df.melt(id_vars='Date', var_name='Employee', value_name='Sales')
# Convert Date column to datetime
df_melted['Date'] = pd.to_datetime(df_melted['Date'])
# Create the line plot
plt.figure(figsize=(12, 8))
sns.lineplot(x='Date', y='Sales', hue='Employee', data=df_melted, marker='o')
# Set y-axis to display whole numbers only
plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'{int(x):,}'))
# Add labels and title
plt.xlabel('Date')
plt.ylabel('Number of Sales')
plt.title('Employee Sales Data Over Time')
# Set custom y-axis ticks with greater spacing for lower numbers and compressed ranges without data
custom_ticks = [0, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100] #+ list(range(200, 4001, 1000))
plt.gca().set_yticks(custom_ticks)
# Show the plot
plt.show()
asked 2 hours ago by Ryan Barnes
Comment from JohanC: A broken y-axis would remove part of the y range. (If a log scale fits your use case, that would be an easier option.)
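The broken y-axis mentioned in the comment could look roughly like the sketch below. It stacks two axes that share the x axis and gives each its own y range, so the empty band in between is never drawn. This is only one way to do it, not something the question or comment spells out: the helper name `plot_broken_axis` and the cut points `low_max`, `high_min` and `high_max` are hypothetical and were picked by eye for the sample data. Plain matplotlib is used instead of seaborn to keep the sketch short.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Date': pd.to_datetime(['2025-01-10', '2025-01-17', '2025-01-24', '2025-01-31']),
    'Bob': [156, 60, 58, 62],
    'Joe': [37, 40, 139, 42],
    'Sally': [62, 265, 63, 67],
    'Total': [3698, 3750, 3720, 3800],
})


def plot_broken_axis(df, low_max=300, high_min=3600, high_max=3900):
    """Draw every series on two stacked axes sharing x (hypothetical helper).

    low_max, high_min and high_max are assumed cut points chosen by eye
    for the sample data; they are not derived from the data.
    """
    fig, (ax_top, ax_bottom) = plt.subplots(
        2, 1, sharex=True, figsize=(12, 8),
        gridspec_kw={'height_ratios': [1, 3], 'hspace': 0.05},
    )
    for name in df.columns.drop('Date'):
        # Plot each series on both axes; the y-limits decide where it is visible
        for ax in (ax_top, ax_bottom):
            ax.plot(df['Date'], df[name], marker='o', label=name)

    ax_top.set_ylim(high_min, high_max)  # extremes (totals)
    ax_bottom.set_ylim(0, low_max)       # normal employee values

    # Hide the spines and ticks where the two axes meet
    ax_top.spines['bottom'].set_visible(False)
    ax_bottom.spines['top'].set_visible(False)
    ax_top.tick_params(axis='x', which='both', bottom=False, labelbottom=False)

    ax_bottom.set_xlabel('Date')
    ax_bottom.set_ylabel('Number of Sales')
    ax_top.set_title('Employee Sales Data Over Time')
    ax_top.legend(title='Employee')
    return fig, (ax_top, ax_bottom)
```

Calling `plot_broken_axis(df)` followed by `plt.show()` displays the figure. For the larger data set the cut points would have to be passed in or computed, since any value falling between `low_max` and `high_min` is simply not shown.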
1 Answer
What about a log scale on Y?
plt.yscale('log')
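Applied to the sample data from the question, the log-scale suggestion might look like the sketch below. The helper name `plot_log_scale` is hypothetical, and plain matplotlib is used instead of seaborn to keep it short. One caveat to keep in mind: a log axis cannot show values of 0, so if some employees have zero sales in a week those points would not appear.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter

# Sample data copied from the question
df = pd.DataFrame({
    'Date': pd.to_datetime(['2025-01-10', '2025-01-17', '2025-01-24', '2025-01-31']),
    'Bob': [156, 60, 58, 62],
    'Joe': [37, 40, 139, 42],
    'Sally': [62, 265, 63, 67],
    'Total': [3698, 3750, 3720, 3800],
})


def plot_log_scale(df):
    """Plot every column except 'Date' on a log-scaled y-axis (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(12, 8))
    for name in df.columns.drop('Date'):
        ax.plot(df['Date'], df[name], marker='o', label=name)

    # Log scale: equal vertical distance per factor of 10, so the totals
    # and the employee values both remain distinguishable
    ax.set_yscale('log')

    # Show major tick labels as whole numbers with thousands separators
    ax.yaxis.set_major_formatter(FuncFormatter(lambda y, _: f'{int(y):,}'))

    ax.set_xlabel('Date')
    ax.set_ylabel('Number of Sales')
    ax.set_title('Employee Sales Data Over Time')
    ax.legend(title='Employee')
    return fig, ax
```

Calling `plot_log_scale(df)` followed by `plt.show()` displays the figure. Unlike a broken axis, nothing is cut out of the y range, but distances on the axis no longer correspond to absolute differences in sales.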