Pandas: Selecting Rows (And Columns) With loc[]
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'lastname': ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger'],
    'email': ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
    'age': [56, 27, 25, 37],
})
Rows (And Columns) By Label
Label?
⟶ The default index (more on indexes) consists of integers, so with a default index a label is just the same as the position used with iloc[]
persons.loc[0]
firstname Joerg
lastname Faschingbauer
email jf@faschingbauer.co.at
age 56
Name: 0, dtype: object
persons.loc[[0,1]]
| | firstname | lastname | email | age |
|---|---|---|---|---|
| 0 | Joerg | Faschingbauer | jf@faschingbauer.co.at | 56 |
| 1 | Johanna | Faschingbauer | johanna@email.com | 27 |

More power: Pandas: Filters
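The difference between label and position only shows once the index is not the default one. A minimal sketch, assuming a hypothetical frame named people with made-up string labels (neither is part of the example above):

```python
import pandas as pd

# Hypothetical frame for illustration: string labels instead of the
# default integer index.
people = pd.DataFrame(
    {'firstname': ['Joerg', 'Johanna', 'Caro'], 'age': [56, 27, 25]},
    index=['jf', 'jo', 'ca'],
)

by_label = people.loc['jo']     # row whose index label is 'jo'
by_position = people.iloc[1]    # second row, whatever its label is

# Both expressions pick the same row here
same_row = by_label.equals(by_position)

# An integer is not a label in this index, so loc[] refuses it
try:
    people.loc[1]
    label_1_found = True
except KeyError:
    label_1_found = False
```

With the default index the labels happen to be 0, 1, 2, …, which is why loc[] and iloc[] look interchangeable for single rows in the persons example.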
Hiccup: Slices Are Inclusive
Contrary to iloc[], the end of a slice specifier is included in the slice
persons.loc[0:1]
| | firstname | lastname | email | age |
|---|---|---|---|---|
| 0 | Joerg | Faschingbauer | jf@faschingbauer.co.at | 56 |
| 1 | Johanna | Faschingbauer | johanna@email.com | 27 |

Why? Read on
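To see the hiccup side by side, the same slice specifier can be passed to both accessors. A small sketch, using a trimmed copy of persons (fewer columns, for brevity):

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'age': [56, 27, 25, 37],
})

label_slice = persons.loc[0:1]       # labels 0 through 1, end included
position_slice = persons.iloc[0:1]   # positions 0 up to, but not including, 1

print(len(label_slice))      # 2
print(len(position_slice))   # 1
```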
Column Selection By Label
persons.loc[0, ['firstname', 'age']]
firstname Joerg
age 56
Name: 0, dtype: object
persons.loc[[0, 1], ['firstname', 'age']]
| | firstname | age |
|---|---|---|
| 0 | Joerg | 56 |
| 1 | Johanna | 27 |
Columns By Slicing: Inclusive
persons.loc[1, 'firstname' : 'age']
firstname Johanna
lastname Faschingbauer
email johanna@email.com
age 27
Name: 1, dtype: object
Not consistent with Python’s definition of ranges
… but user friendly (hard to understand why 'age' had to be left out)
Rant: does slicing by column name bear any value?
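One thing worth knowing before relying on column slices: they follow the order of the columns in the frame, not the alphabet. A sketch, assuming a two-row copy of persons and a reordered variant created only for this illustration:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna'],
    'lastname': ['Faschingbauer', 'Faschingbauer'],
    'email': ['jf@faschingbauer.co.at', 'johanna@email.com'],
    'age': [56, 27],
})

# All rows, columns 'lastname' through 'email' (both ends included)
block = persons.loc[:, 'lastname':'email']
print(list(block.columns))        # ['lastname', 'email']

# Same slice on a frame with reordered columns: a different selection
reordered = persons[['age', 'lastname', 'firstname', 'email']]
block_reordered = reordered.loc[:, 'lastname':'email']
print(list(block_reordered.columns))   # ['lastname', 'firstname', 'email']
```

So a column slice is only as stable as the column order of the frame it is applied to; an explicit list of column names does not have that dependency.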
Summary
Attention: slices with loc[] include their end, which is inconsistent with the rest of Python (and iloc[])
More (absolute) power by using filters with loc[]
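As a taste of what a filter with loc[] looks like: the row selector can be a boolean Series instead of labels. A minimal sketch; the age threshold of 30 is an arbitrary choice for illustration:

```python
import pandas as pd

persons = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp'],
    'lastname': ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger'],
    'email': ['jf@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
    'age': [56, 27, 25, 37],
})

# Boolean Series: True for every row where age is above 30
older_than_30 = persons['age'] > 30

# Rows picked by the filter, columns picked by label
result = persons.loc[older_than_30, ['firstname', 'age']]
print(result)
```

The rows that pass the filter keep their original index labels (0 and 3 here); they are not renumbered.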