So .loc and .iloc are not your typical functions. They somehow use [ and ] to surround the arguments, so that they look comparable to normal array indexing. However, I have never seen this in another library (that I can think of; maybe numpy has something like this that I'm blanking on), and I have no idea how it technically works or is defined in the Python code.
Are the brackets in this case just syntactic sugar for a function call? If so, how would one make an arbitrary function use brackets instead of parentheses? Otherwise, what is special about their use/definition in Pandas?
Note: The first part of this answer is a direct adaptation of my answer to this other question, that was answered before this question was reopened. I expand on the "why" in the second part.
Indeed, they are not functions at all. I'll make examples with `loc`; `iloc` is analogous (it uses different internal classes).

The simplest way to check what `loc` actually is, is to print its type. This tells us that `df.loc` is an instance of a `_LocIndexer` class. The syntax `loc[]` derives from the fact that `_LocIndexer` defines `__getitem__` and `__setitem__`*, which are the methods Python calls whenever you use the square-bracket syntax.

So yes, brackets are, technically, syntactic sugar for a method call, just not the function you thought it was (there are of course many reasons why Python is designed this way; I won't go into the details here because 1) I am not sufficiently expert to provide an exhaustive answer and 2) there are a lot of better resources on the web about this topic).
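To make the check above concrete, here is a minimal sketch. The DataFrame contents and the variable names are made up for illustration; only `df.loc`, `_LocIndexer`, `__getitem__` and `__setitem__` come from the discussion above, and the exact module path printed may vary between pandas versions.

```python
import pandas as pd

# Hypothetical example frame; the data is invented for illustration.
df = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])

# df.loc is an object, not a function: inspect its class.
loc_type = type(df.loc)
print(loc_type)

# Square brackets work because the class provides these special methods.
has_get = hasattr(loc_type, "__getitem__")
has_set = hasattr(loc_type, "__setitem__")
print(has_get, has_set)

# Two spellings of the same lookup: brackets, and the explicit method call.
via_brackets = df.loc["y", "a"]
via_dunder = df.loc.__getitem__(("y", "a"))
print(via_brackets == via_dunder)
```

Note that the comma-separated arguments inside the brackets arrive in `__getitem__` as a single tuple, which is why the explicit call passes `("y", "a")`.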
*Technically, it's its base class `_LocationIndexer` that defines those methods; I'm simplifying a bit here.

I'm entering speculation territory here, because I couldn't find any document explicitly discussing design choices in Pandas. However, there are at least two good reasons I see for choosing the square brackets.
The first, and most important, reason is that you simply can't do with a function call everything you can do with the square-bracket notation, because assigning to a function call is a syntax error in Python.
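A quick way to see this without crashing a script is to ask Python to compile both forms. This is only a sketch: the helper `is_valid_syntax` and the names `f` and `d` inside the source strings are hypothetical, and nothing is actually executed, only parsed.

```python
def is_valid_syntax(source):
    """Return True if Python can compile the given source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Assignment to a function call is rejected by the parser...
call_assignment_ok = is_valid_syntax("f(0) = 1")
# ...while assignment to a subscript is perfectly legal syntax.
subscript_assignment_ok = is_valid_syntax("d[0] = 1")

print(call_assignment_ok, subscript_assignment_ok)  # False True
```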
Using round brackets for a "function" call invokes the underlying `__call__` method (note that any class that defines `__call__` is callable, so "function" call is an imprecise term, because Python doesn't care whether something is a function or just behaves like one).

Using square brackets, instead, calls either `__getitem__` or `__setitem__`, depending on where the expression appears (`__setitem__` if it's the target of an assignment, `__getitem__` when the value is read; deletion with `del` goes through `__delitem__`). There is no way to mimic this behaviour with a function call: you'd need a setter method to modify the data in the DataFrame, and it still couldn't be used in an assignment operation.

This brings me to the second reason: consistency. You can already access elements of a DataFrame via square brackets.
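As a small illustration of that consistency, the sketch below reads and writes through both plain brackets and `loc`. The frame, its column names and its index labels are made up for the example.

```python
import pandas as pd

# Hypothetical example frame; the data is invented for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

col = df["a"]            # plain square brackets select a column
row = df.loc["y"]        # loc uses the same syntax, selecting by label

df["c"] = df["a"] * 2    # brackets on the left of "=" assign a column
df.loc["y", "b"] = 50    # and loc can be assigned to in the same way

print(df)
```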
When using `loc` you're still trying to refer to some items in the DataFrame, so it's more consistent to use the same syntax instead of asking the user to use some getter and setter functions (it's also, I believe, "more pythonic", but that's a fuzzy concept I'd rather stay away from).
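To answer the "how would one do this" part of the question, here is a toy sketch of the same pattern in plain Python. `ToyFrame` and `ToyIndexer` are hypothetical names and this is not how pandas implements its indexers; it only shows which special method each syntax triggers.

```python
class ToyIndexer:
    """Hypothetical stand-in for an indexer object like the one behind loc."""

    def __init__(self, data):
        # Shared with the owning ToyFrame, so writes are visible there.
        self._data = data

    def __getitem__(self, key):
        # Called for reads such as indexer[key]
        return self._data[key]

    def __setitem__(self, key, value):
        # Called for assignments such as indexer[key] = value
        self._data[key] = value

    def __call__(self, key):
        # Called for indexer(key); it can read, but can never be assigned to
        return self._data[key]


class ToyFrame:
    """Hypothetical container exposing an indexer through a property."""

    def __init__(self, data):
        self._data = dict(data)

    @property
    def loc(self):
        return ToyIndexer(self._data)


frame = ToyFrame({"x": 1, "y": 2})
read_with_brackets = frame.loc["x"]   # ToyIndexer.__getitem__
read_with_parens = frame.loc("x")     # ToyIndexer.__call__
frame.loc["y"] = 20                   # ToyIndexer.__setitem__
after_write = frame.loc["y"]
print(read_with_brackets, read_with_parens, after_write)  # 1 1 20
```

Because `loc` is a property that returns an object, no parentheses are needed before the brackets, which is what makes `frame.loc[...]` look like a special kind of call.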