More pandas series practice
Level: Intermediate (score: 3)
In "Introducing pandas series" and "Let's play with pandas series" we created some simple pandas Series and then looked at how to retrieve values and slices of values from them. In this final bite on Series we'll look at some of the options for changing the values of the elements in a series using basic arithmetic. Then we'll look at creating masks.
Series Maths
We'll start with two parts that perform some arithmetic on a pandas Series.
- In the first part we write a function that takes a Series, an operation (addition, subtraction, multiplication or division) and an integer value. The task is to apply the operation, with that value, to each element in the series.
- In the second part, instead of applying the operation to a series and an integer, the operation is applied to two series. Hint: keep in mind the indexes of both series.
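The two parts above could be sketched roughly as follows. This is only an illustration, not the bite's reference solution: the function names, the `OPS` lookup table and the string keys are all hypothetical, so check the docstrings and tests for the real signatures. The second half shows why the index hint matters: pandas aligns two series on their index labels, not on position.

```python
import operator

import pandas as pd

# Hypothetical mapping from an operation name to a Python operator.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def series_op_scalar(ser, op_name, value):
    """Apply the named operation between every element of ser and value."""
    return OPS[op_name](ser, value)


def series_op_series(ser1, ser2, op_name):
    """Apply the named operation between two series, aligned on index."""
    return OPS[op_name](ser1, ser2)


s = pd.Series([1, 2, 3])
times_ten = series_op_scalar(s, "mul", 10)

# The indexes only partly overlap: labels 1 and 2 appear in both series.
a = pd.Series([1, 2, 3], index=[0, 1, 2])
b = pd.Series([10, 20, 30], index=[1, 2, 3])
summed = series_op_series(a, b, "add")

# Labels found in only one series produce NaN. Series.add with fill_value
# treats the missing side as 0 instead.
filled = a.add(b, fill_value=0)
```

Here `summed` has values only for labels 1 and 2, while labels 0 and 3 come out as NaN because each exists in just one of the inputs.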
Series Masks
We'll complete this mini path on pandas Series by looking at creating masks. As mentioned in the code comments, don't confuse masks in this context with the pandas.Series.mask method. That is a very powerful and useful method, but it is not what we're looking for here. For parts three and four we want a Boolean mask:
In both NumPy and pandas we can create masks to filter data. Masks are "Boolean" arrays, that is, arrays of true and false values, and they provide a powerful and flexible method of selecting data.
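As a rough illustration of the difference, the sketch below builds a Boolean mask over a series of letters and uses it to filter, then contrasts that with pandas.Series.mask. The sample data and variable names are made up for the example, and `isin` is just one of several ways to build such a mask.

```python
import pandas as pd

letters = pd.Series(list("abcabd"))

# A Boolean mask: one True/False value per element of the series.
mask = letters.isin(["a", "b"])

# Indexing with the mask keeps only the elements where it is True.
kept = letters[mask]

# Inverting the mask with ~ keeps everything else instead.
others = letters[~mask]

# Series.mask does something different: it keeps the series at full
# length and replaces the values where the condition is True (NaN by default).
replaced = letters.mask(mask)
```

Note that `kept` retains the original index labels (0, 1, 3, 4) and the contents of `letters` are left unchanged.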
- For the third part of this bite we simply need to create a mask to filter certain letters from a series of letters. You don't need to worry about case or anything like that (changing case would change the contents of the series, so you probably wouldn't want to do that anyway).
- As per Computer Software for Quartiles: The Excel function QUARTILE(array, quart) provides the desired quartile value for a given array of data. In the QUARTILE function, array is the dataset of numbers that is being analyzed and quart is any of the following 5 values, depending on which quartile is being calculated:

| Quart | Output QUARTILE Value |
| :---: | :------------------------------: |
| 0 | Minimum value |
| 1 | Lower Quartile (25th percentile) |
| 2 | Median |
| 3 | Upper Quartile (75th percentile) |
| 4 | Maximum value |

For this part we want not only the median (the 50th percentile, or second quartile value) but also the mean. The requirement is to take a series of floats and return all values in the series that are within the given range. To do this you need to create a mask, apply it to filter the series, and then return the resulting series.
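The create-mask, apply-mask, return-result flow described above might look something like this. It is a sketch under stated assumptions rather than the expected answer: the function name `values_in_range`, the sample data, and the choice to use the median and mean as inclusive bounds are all assumptions made for illustration, and the docstrings and tests define the real range.

```python
import pandas as pd


def values_in_range(ser, low, high):
    """Return the values of ser lying between low and high (inclusive)."""
    mask = ser.between(low, high)  # Boolean mask, inclusive on both ends
    return ser[mask]


data = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])

# Series.quantile plays a similar role to QUARTILE(array, quart):
# 0, 0.25, 0.5, 0.75 and 1 correspond to quart values 0 to 4.
quartiles = data.quantile([0, 0.25, 0.5, 0.75, 1])

# Assumption for this example only: the range runs from the median
# to the mean, whichever is smaller first.
low, high = sorted([data.median(), data.mean()])
result = values_in_range(data, low, high)
```

For this sample the median is 3.0 and the mean is 4.0, so `result` holds the two values 3.0 and 4.0.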
Of course, these snippets are not all as easy and straightforward as they seem. You'll need to refer to the docstrings and the tests to fully understand the requirements.