I have data in pandas DataFrames and use Altair to plot several subsets of it. In every plot, colours are taken from a categorical column. To keep the charts easy to read, I want each category to have the same colour in all charts, even when the set of categories differs between charts. I also want to keep the colour scheme I normally use, say the default one.

Simple MWE:

import streamlit as st
import altair as alt
from vega_datasets import data
iris = data.iris()

ch1 = alt.Chart(iris).mark_point().encode(
    x='petalWidth',
    y='petalLength',
    color='species'
)

Now suppose that, for some reason, I filter the data so that 'versicolor' disappears:

ch2 = alt.Chart(iris[iris.species != 'versicolor']).mark_point().encode(
    x='petalWidth',
    y='petalLength',
    color='species'
)

In this second chart, 'virginica' has changed colour, which is what I want to avoid.
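A rough illustration of why the colour shifts, assuming the colour scale sorts the distinct category values and hands out palette entries by position. The helper `positional_colours` is hypothetical and only mimics that behaviour in plain Python; the three hex values are the ones I use in my workaround further down.

```python
# Illustration only: mimic positional colour assignment on a sorted domain.
# Assumption: palette entries are assigned by index in the sorted list of
# distinct category values.
palette = ['#4e79a7', '#f28e2b', '#e15759']

def positional_colours(categories, palette):
    """Pair each distinct category (sorted) with the palette entry at the same index."""
    domain = sorted(set(categories))
    return dict(zip(domain, palette))

full = positional_colours(['setosa', 'versicolor', 'virginica'], palette)
filtered = positional_colours(['setosa', 'virginica'], palette)

print(full['virginica'])      # third palette entry
print(filtered['virginica'])  # second palette entry, so the colour has shifted
```

With 'versicolor' gone, 'virginica' moves from the third slot to the second and picks up the colour that 'versicolor' used to have.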

My current solution is:

scheme_col = ['#4e79a7', '#f28e2b', '#e15759']
sel_idx = [0, 2]
sel_col = [scheme_col[i] for i in sel_idx]

ch3 = alt.Chart(iris[iris.species != 'versicolor']).mark_point().encode(
    x='petalWidth',
    y='petalLength',
    color=alt.Color('species').scale(range=sel_col)
)

I get the result I want, but this approach involves two manual steps:

  1. I need scheme_col, i.e. the colours of the current scheme. I did not find a way to get this programmatically.
  2. I need to know which categories are present in each plot - though I guess this should not be that difficult to get, by comparing the dataframes.

Is there a more elegant way of achieving this?
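One possible direction, as a minimal sketch rather than a definitive answer: fix the category-to-colour mapping once from the full data, then derive both the domain and the range for each subset from it. The helper name `fixed_colour_scale` is hypothetical, and the palette still has to be supplied by hand (here the same three values as `scheme_col` above), so this only automates the second manual step.

```python
def fixed_colour_scale(all_categories, present_categories, palette):
    """Return (domain, range) lists so each category keeps its palette slot."""
    # Assign palette slots once, from the full, sorted category list.
    full_domain = sorted(set(all_categories))
    if len(full_domain) > len(palette):
        raise ValueError("palette has fewer colours than categories")
    colour_of = dict(zip(full_domain, palette))
    # Keep only the categories present in this subset, in the same order.
    present = set(present_categories)
    domain = [c for c in full_domain if c in present]
    return domain, [colour_of[c] for c in domain]

scheme_col = ['#4e79a7', '#f28e2b', '#e15759']
all_species = ['setosa', 'versicolor', 'virginica']
subset_species = ['setosa', 'virginica']

domain, sel_col = fixed_colour_scale(all_species, subset_species, scheme_col)
print(domain, sel_col)
```

With the dataframes above, `iris.species` and the filtered frame's `species` column could be passed in place of the two lists, and the result used as `color=alt.Color('species').scale(domain=domain, range=sel_col)`. If a legend that lists every category is acceptable, setting only `domain=` to the full category list on each chart should also keep the colours aligned while leaving the default scheme in place.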
