Monday, 13 July 2020

Blog 5 Python Series from Dictionary Attributes & Methods of Series( )

Learning Objectives:

 

* Understanding the various data types in Python and how to use them correctly

* A quick recap of the 'dictionary' data type and how it differs from the other data types

* How to create a pandas series from an existing dictionary

* Understanding the purpose of various attributes and methods of the constructor Series( )

 

Text highlighted in blue colour is to be penned down in the IP register along with the code.


Certain points/terms which should be clear before we start 


Dictionary - a set of key: value pairs, 
with the requirement that the keys are unique (within one dictionary). 

A Dictionary can be created in 2 ways 
1. By the dictionary object - 

A pair of curly braces creates an empty dictionary: {}

Placing a comma-separated list of key value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output.

# Creating a dictionary type variable named diction1

diction1 = {                                                    
            "Name": "Kamalan",
            "Age": 34,
            "Address": "High Level Address"
                   }
print(diction1)



Memory Allocation - 


2. By the construct / method dict( ) - 

Each value is assigned to its key with the assignment operator ( = ), and the pairs are separated by commas. Note that the keys are not enclosed within quotes as they are in a dictionary variable, so each key must be a valid Python identifier (for example, "E-mail" cannot be used as a key this way).

See the code with dict( ) for the same data as above -

diction1 = dict(                                # Creating a dictionary through the method dict()
                Name="Kamalan",
                Age=34,
                Address="High Level Address")
print(diction1)



Operations on Dictionary -

1. Creating -  (Creating as a variable and through the dict( ))

diction1 = dict(                               
                Name="Kamalan",
                Age=34,
                Address="High Level Address")


2. Printing -  

    a. The entire dictionary by print(diction1)

    b. The values of the dictionary of each item by iterating the values( ) 
         for x in diction1.values( ):
              print(x)

    c. The keys of the dictionary of each item by iterating the keys( ) 
         for x in diction1.keys( ):
              print(x)

    d. The keys and values of each item (in a tabular form) of the dictionary by the items( )
        for x,y in diction1.items( ):
              print(x,y)



3.  Assigning / Extracting  a key value pair - 

Assign the new value to a new key in an existing dictionary by writing the new key name inside square brackets. String keys and string values are enclosed within quotes; numbers are not.

Eg - To assign a new key as "E-mail" with value "kamal1986@gmail.com" to dictionary 

diction1["E-mail"]="kamal1986@gmail.com"


a. Access the values of a dictionary  by referring to its key name, inside square bracket.
 
For eg - extractkey=diction1["Name"] 
Output - 'Kamalan'

b. Access the values of a dictionary using  get( ) referring to its key name, as an argument.

For eg - print(diction1.get("Name"))
Output - 'Kamalan'

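** It is an error (KeyError) to extract a value with square brackets using a non-existent key. get( ) does not raise an error in this case; it returns None, or the default value passed as its second argument.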
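The difference between the two ways of extracting a value shows up only when the key is missing. The short sketch below uses a cut-down copy of diction1; the key "Phone" and the default text "not available" are made up for illustration.

```python
# Cut-down copy of diction1 from this post
diction1 = {"Name": "Kamalan", "Age": 34}

# get( ) returns None when the key is missing ...
missing_plain = diction1.get("Phone")
# ... or the default value given as the second argument
missing_default = diction1.get("Phone", "not available")

# Square brackets raise KeyError for a missing key
try:
    diction1["Phone"]
    raised = False
except KeyError:
    raised = True

print(missing_plain)
print(missing_default)
print("KeyError raised:", raised)
```

So get( ) is the safer choice when we are not sure that the key exists.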

4. Deletion - 

a. pop() method removes the item with the specified key name:
b. popitem() method removes the last inserted item (in versions before 3.7, a random item is 
     removed instead):
c. del keyword removes the item with the specified key name:
d. The del keyword can also delete the dictionary completely:

diction1 = dict(                               
                Name="Kamalan",
                Age=34,
                Address="High Level Address"
                )
print("The original dictionary is : ", diction1)

# Before deleting items lets add some

diction1["E-mail"]="kamal1986@gmail.com"
diction1["PhoneNo"]=9911002345

print("\nThe new dictionary with two more items added is : ")
for x,y in diction1.items():
    print(x,y)
    
# Start deleting the items 

diction1.pop("PhoneNo")
print("\nDictionary with deleted")
print("\n1. Phone No -",diction1)

diction1.popitem()
print("\n2. Last inserted item -",diction1)

del diction1["Age"]
print("\n3. Age  - ",diction1)




Output -



del diction1
print(diction1)                                                 #this will cause an error because "diction1" no longer exists.


5. Replacing - You can change the current value of a specific item by referring to its key name and assigning that key a new value.

diction1["Age"]=24

If the key does not exist, this assignment does not raise an error; it adds a new key: value pair to the dictionary.

6. Checking an existing key

diction2 = {
  "Brand": "Ford",
  "Model": "Mustang",
  "Year": 1964
            }
if  "Model"  in  diction2:
     print("Yes, 'Model' is one of the keys in the diction2 dictionary")

Output - Yes, 'Model' is one of the keys in the diction2 dictionary



Dictionary Methods

Python has a set of built-in methods that you can use on dictionaries.

Method          Description
clear()         Removes all the elements from the dictionary
copy()          Returns a copy of the dictionary
fromkeys()      Returns a dictionary with the specified keys and value
get()           Returns the value of the specified key (None if the key does not exist)
items()         Returns a view containing a tuple for each key value pair
keys()          Returns a view containing the dictionary's keys
pop()           Removes the element with the specified key and returns its value
popitem()       Removes the last inserted key-value pair
setdefault()    Returns the value of the specified key. If the key does not exist: insert the key, with the specified value
update()        Updates the dictionary with the specified key-value pairs
values()        Returns a view of all the values in the dictionary
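A few of these methods are easier to remember once they are seen working together. The sketch below reuses the Ford data of diction2; the keys "Colour" and "Owner" and their values are made up for illustration.

```python
diction2 = {"Brand": "Ford", "Model": "Mustang", "Year": 1964}

# copy( ) gives an independent copy, so later changes do not touch it
backup = diction2.copy()

# setdefault( ) inserts the key because "Colour" does not exist yet
colour = diction2.setdefault("Colour", "Red")

# update( ) changes an existing key and adds a new one in a single call
diction2.update({"Year": 2020, "Owner": "Kamalan"})

# fromkeys( ) builds a new dictionary where every key gets the same value
blank = dict.fromkeys(["Brand", "Model"], "unknown")

print(list(diction2.keys()))
print(backup)
print(blank)

# clear( ) empties the dictionary but the variable still exists
diction2.clear()
print(diction2)
```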

Creating a Pandas Series from Dictionary

A Pandas Series is a labeled (indexed) array that holds data.

Series(data [, index]) - is the constructor of the Series class of the library Pandas (so import pandas to use it). It converts the data passed as its argument (a dictionary here) into a series, using the keys of the dictionary as the index labels. The optional index argument selects which labels appear in the series and in which order.

a. Creating a pandas series using Series( data) 

import pandas as pa
diction2 = {
                      "Brand": "Ford",
                      "Model": "Mustang",
                      "Year": 1964
                    }
print("The dictionary is = ", diction2)
Myseries=pa.Series(diction2)
print("The series created from the above dictionary is = ", Myseries)

b. Creating a pandas series with the argument/ parameter index - this keyword may be used when the order of the key-value needs to be changed. 

import pandas as pa
diction2 = {
                      "Brand": "Ford",
                      "Model": "Mustang",
                      "Year": 1964
                    }
print("The dictionary is = ", diction2)
Myseries=pa.Series(diction2, index =["Year", "Brand", "Model"] )
print("The series created from the above dictionary is = ", Myseries)


When we pass an index value which actually does not exist as a key in the defined dictionary during the creation of a series then the series assigns a value NaN to such an index.
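The NaN behaviour can be checked with a small sketch. It reuses diction2; the label "Colour" is a made-up label that is not a key of the dictionary.

```python
import pandas as pd

diction2 = {"Brand": "Ford", "Model": "Mustang", "Year": 1964}

# "Colour" is not a key of diction2, so its value becomes NaN.
# "Model" is left out of the index, so it does not appear in the series.
myseries = pd.Series(diction2, index=["Year", "Colour", "Brand"])

print(myseries)
print(myseries.isna())   # True only for the label "Colour"
```

Notice that a key of the dictionary that is not listed in index is simply dropped from the series.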




Task 1 Write a Python program to convert a dictionary to a Pandas series

Task 2 Write a Python program to convert a NumPy array to a Pandas series.

Task 3 Write a Pandas program to convert a given Series to an array.

Task 4 Write a Pandas program to change the order of index of a given series


Some attributes and methods of Series (not every attribute and method of Series works on a Series created from a dictionary)
** name - attribute assigns a name to the Series object.

import pandas as pa
diction2 = {
            "Brand": "Ford",
                     "Model": "Mustang",
                     "Year": 1964
            }
MyS=pa.Series(diction2)
print(MyS)
MyS.name = 'Mustang is in'
print( )
print(MyS) 


MyS.name = 707 - Name: 707, dtype: object




** index.name - attribute assigns a name (heading) to the row-label / index column of the series.

MyS.index.name = "Features" 

Features
Brand       Ford
Model    Mustang
Year        1964
dtype: object

MyS.index.name = "Features"
MyS.name= "It's the best"

Features
Brand       Ford
Model    Mustang
Year        1964
Name: It's the best, dtype: object



** values - attribute returns all the values of a series as a NumPy array
(only the values, without the keys / index labels)

print(MyS.values) -  ['Ford' 'Mustang' 1964]

diction2 = {

                        " "         : "Ford",

                        "Model": " ",

                        "Year" : 1964

                    }

MyS=pa.Series(diction2)

print(MyS.values)

print(S1.values) - [100 200 300 400 100]


** size - attribute returns the number of values in the series.

MyS.size - 3
S1.size - 5


** empty - attribute returns True if the series has no values at all, otherwise False.
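The three attributes values, size and empty can be tried together. This is a small sketch; empty_s is a made-up series with no data, created only to show empty returning True.

```python
import pandas as pd

s = pd.Series({"Brand": "Ford", "Model": "Mustang", "Year": 1964})
empty_s = pd.Series(dtype="object")   # a series with no values

print(s.values)        # NumPy array holding only the values
print(s.size)          # number of values
print(s.empty)         # False, the series holds data
print(empty_s.size)
print(empty_s.empty)   # True, nothing is stored
```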

1. at - attribute accesses a single value by its label. A series has only row labels (keys); the row/column label pair is needed only for a dataframe.

series_object.at[key]

** print("The value at 'Model' is ", Myseries[1])

print("The value at 'Model' is ", Myseries.at["Model"])


2. loc[ key ] - attribute returns the value stored at the specified key (label).

When a Series is created from a dictionary, the keys of the dictionary become the index labels of the Series.

So if we pass a positional index address (such as 0) instead of a key to loc, it raises an error. The pandas version used for this post reports a TypeError; newer versions may report a KeyError instead.

TypeError: cannot do label indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [0] of <class 'int'>

When the key is passed exactly as it is named in the dictionary, loc returns the 'value' of that key.


3. iloc[ index ] - attribute stands for 'integer location'. It returns the value stored at the specified positional index address (counting from 0), whatever the key at that position is.
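The three ways of reaching one value can be compared side by side. This is a minimal sketch on the same Ford data; it catches both KeyError and TypeError because the error raised by loc for a number depends on the pandas version installed.

```python
import pandas as pd

myseries = pd.Series({"Brand": "Ford", "Model": "Mustang", "Year": 1964})

by_at = myseries.at["Model"]     # single value by key
by_loc = myseries.loc["Model"]   # value by key
by_iloc = myseries.iloc[1]       # value by position, counting from 0

# loc expects a key; a position such as 1 is not a key of this series
try:
    myseries.loc[1]
    loc_failed = False
except (KeyError, TypeError):
    loc_failed = True

print(by_at, by_loc, by_iloc)
print("loc with a position failed:", loc_failed)
```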



Methods of the Construct Series( )
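** head( n ) - returns the first n values of the series (the first 5 when n is not given).
** tail( n ) - returns the last n values of the series (the last 5 when n is not given).
** count( ) - returns the number of values in the series that are not NaN / missing.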
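A quick sketch of these three methods is given below. The marks are made-up data, and one mark is deliberately stored as NaN to show that count( ) skips missing values while size does not.

```python
import pandas as pd

marks = pd.Series({
    "Stud1": 100, "Stud2": 98, "Stud3": float("nan"),
    "Stud4": 88, "Stud5": 75, "Stud6": 60,
})

first_two = marks.head(2)   # first 2 values
last_two = marks.tail(2)    # last 2 values
filled = marks.count()      # values that are not NaN

print(first_two)
print(last_two)
print("size =", marks.size, "count =", filled)
```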

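1. to_numeric( ) - a function of pandas (called as pd.to_numeric(series_object), not as a method of the series) that converts the values of a series to a numeric data type. A value that cannot be converted, such as 'Python' below, raises a ValueError unless errors='coerce' is passed, which replaces such a value with NaN.

import pandas as pd

d1={ '1':100, '2':200, '3':'Python', '4': 300.12, '5':400}

s1 = pd.Series(d1)

print("Original Data Series:")

print(s1)

print("Change the said data type to numeric:")

# errors='coerce' turns the non-numeric value 'Python' into NaN instead of raising an error
s2 = pd.to_numeric(s1, errors='coerce')

print(s2)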



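2. sort_values( ) - returns a new series with the values sorted in ascending order (pass ascending=False for descending order). The original series is not changed.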


import pandas as pd

d1={ '1':600, '2':200, '3':150, '4': 300.12, '5':400}

s1 = pd.Series(d1)

print("Original Data Series:", s1)

new_s = s1.sort_values()

print(new_s)






3. abs( series_object)


Returns a Series with absolute numeric value of each element. This function only applies to elements that are all numeric.


















If any element of the series is a string, abs( ) raises the error below.

TypeError: bad operand type for abs(): 'str'
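Both cases, the all-numeric series and the series containing a string, can be reproduced with the sketch below. The temperature readings and the series named mixed are made-up data.

```python
import pandas as pd

# All values numeric: abs( ) works
temps = pd.Series({"Mon": -4, "Tue": 3, "Wed": -1.5})
positive = temps.abs()
print(positive)

# One value is a string: abs( ) cannot handle it
mixed = pd.Series({"a": -1, "b": "Python"})
try:
    mixed.abs()
    abs_failed = False
except TypeError as err:
    abs_failed = True
    print("Error:", err)
```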


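4.   add( series_object [, fill_value] ) - adds (mathematically) the values of two series that have matching keys and shows "NaN" as the value for the keys that are present in only one of them. When fill_value is given, it is used in place of the missing value, so the result for such a key is no longer NaN.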
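The effect of fill_value is easiest to see with two small series that share only one key. The names term1 and term2 and the marks are made up for this sketch.

```python
import pandas as pd

term1 = pd.Series({"Stud1": 40, "Stud2": 35})
term2 = pd.Series({"Stud2": 45, "Stud3": 30})

# Only Stud2 is in both series, so Stud1 and Stud3 become NaN
plain = term1.add(term2)

# fill_value=0 treats the missing mark as 0 before adding
filled = term1.add(term2, fill_value=0)

print(plain)
print(filled)
```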





5. agg( func ) - agg is short for aggregate. This method calculates aggregate values such as the minimum, maximum, sum, mean and median of the given numeric series. The operation is passed by its name, for example 'sum'.



6. aggregate( func ) - does the same work as agg( ); agg( ) is simply the shorter name of aggregate( ). In both of them the operation name can be passed directly or assigned to the keyword 'func'.


           s1.aggregate(func='sum')              <-->              s1.agg('sum')
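The sketch below runs both spellings on the student marks used later in this post, and also passes a list of operation names to get several aggregate values in one call.

```python
import pandas as pd

s1 = pd.Series({"Stud1": 100, "Stud2": 98, "Stud3": 85, "Stud4": 88})

total = s1.agg("sum")                    # short name
same_total = s1.aggregate(func="sum")    # long name with the keyword func

# A list of names returns one value per operation
summary = s1.agg(["min", "max", "mean", "median"])

print(total, same_total)
print(summary)
```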


**Kwargs 


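7. append( series_objects ) - to add two or more series one after the other. Note that append( ) was removed from Series in pandas 2.0; in newer versions use pd.concat([series1, series2]) to get the same result.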


Concatenating more than one series (Dictionary Based)
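A minimal sketch of joining two dictionary-based series is given below. It uses pd.concat( ), which works in both older and newer pandas versions; the marks are made-up data.

```python
import pandas as pd

s1 = pd.Series({"Stud1": 100, "Stud2": 98})
s2 = pd.Series({"Stud3": 85, "Stud4": 88})

# The second series is placed after the first; the keys are kept as they are
combined = pd.concat([s1, s2])

print(combined)
```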


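7. transform( func ) - to change / update every value of a series (here, one created from a dictionary) by passing a function, usually a lambda, that receives the data through a user-defined variable. The result has the same number of values as the original series.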


Lambda functions are small functions usually not more than a line. It can have any number of arguments just like a normal function. The body of lambda functions is very small and consists of only one expression.


But before this let us recall how to create a user-defined function / method 


def ami(x,y):

       return x+y

x = 10

y = 20


print(ami(x,y))


Output - 30




Now the same above task can also be done by lambda function -


print( (lambda x, y : x + y) (10,20) )


30


Lambda function can be used to save both coding and time.


Lambda in Series --> A lambda function can be used to update the data of a series when it is passed as the argument of transform( ). The user-defined variable is written immediately after the keyword lambda, and the update / mathematical calculation / statistical calculation follows the colon.


series_object.transform( lambda variable : variable update operation )


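For example, for a series x holding the values 100, 90, 100 and 90, the expression x - x.sum( ) works out as follows:

x.sum( ) = 100 + 90 + 100 + 90 = 380

x - x.sum( ) for the first value = 100 - 380 = -280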


import pandas as pd

diction1= {'Stud1':100, 'Stud2':98, 'Stud3':85, 'Stud4':88}

s1=pd.Series(diction1)


print(s1.transform(lambda y:y+y.sum()))

print(s1.transform(lambda x: x-x.mean()))





















8. to_list(  ) - to convert a series to a python list.




Output - The values of series are converted into list items and are enclosed within square bracket.
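A short sketch of the conversion, using made-up marks:

```python
import pandas as pd

s1 = pd.Series({"Stud1": 100, "Stud2": 98, "Stud3": 85})

# Only the values go into the list; the keys are not included
marks = s1.to_list()

print(marks)
print(type(marks))
```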


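** A nested dictionary is one dictionary whose values are themselves dictionaries. A series made from a nested dictionary has one element per outer key, and each element is an inner dictionary.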


import pandas as pd

nd={'d1':{'Stud1':100, 'Stud2':95}, 

         'd2':{'Stud3':95,   'Stud4':98}

        }

s1= pd.Series(nd)    

print("Nested Dictionary", nd)

print("One Series\n", s1)


** The individual dictionaries of a nested dictionary can hold any kind of data and any number of items.




When such a series (made up of a nested dictionary) is converted to a list, the output list is enclosed in square brackets, but each inner dictionary stays enclosed within curly braces (it is still a dictionary).



9. apply( lambda / [(pd.Series).func( )] ) - performs the custom / user-defined operation on each element (value) of the Series, one element at a time.


apply( lambda )


import pandas as pd

d1={'Stud1':100, 'Stud2':75}

s1= pd.Series(d1)

print("Series from Dictionary\n", s1)

result = s1.apply(lambda x : 'EXCELLENT' if x>90 else 'WORK HARDER') 

print(result)




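 ** print(s1.apply(lambda y:y+y.sum())) - this does not work with apply( ), because apply( ) passes one value at a time to the lambda and a single number has no sum( ) method. Use transform( ) for a calculation that needs the whole series.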



apply( pd.Series ).func( )



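Functions / Methods that can follow apply( pd.Series ) - apply( pd.Series ) returns a dataframe, so the methods below are methods of that dataframe.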


1. stack ( )


import pandas as pd

nd= {'d1':{'Stud1':100, 'Stud2':95}, 

         'd2':{'Stud3':95, 'Stud4':98}}

s1= pd.Series(nd)    

print("Series from Dictionary (Original) ---\n", s1)

s = s1.apply(pd.Series).stack( )

print("\nThe stacked Series is ---\n", s)




Single dictionary series when stacked will return the same series as original


2. mathematical / statistical / aggregation function


s.apply(pd.Series).sum( )





s2 = s.apply(pd.Series).max()

print("\nApplied with func max() \n", s2)



Output - 

Applied with func max()

 0    100.0
dtype: float64


** These functions return a value calculated over all the elements / values of the Series; we cannot get an output for a chosen individual row or column. For that, the data should be introduced as a dataframe.



10. type( series_object ) - returns the class type of the object passed as its argument; for a series it shows that the object is a pandas Series.



import pandas as pd

nd={'d1':{'Stud1':100, 'Stud2':95}, 

         'd2':{'Stud3':95, 'Stud4':98}}

s1= pd.Series(nd)    

print("Series from Dictionary")

print("The type of the Series is ", type(s1))



OUTPUT-  

Series from Dictionary

The type of the Series is  <class 'pandas.core.series.Series'>


import pandas as pd

nd={'d1':{'Stud1':100, 'Stud2':95}, 

         'd2':{'Stud3':95, 'Stud4':98}}

s1= pd.Series(nd)    

print("Series from Dictionary")

print("The type of the Series is ", type(s1))

s = s1.apply(pd.Series).stack()

print("One Series")

print("The type of nested series is", type(s))

print("The series converted to list is ", s.to_list())

print("The type of list is ", type(s.to_list()))



OUTPUT  -
Series from Dictionary
The type of the Series is  <class 'pandas.core.series.Series'>
One Series
The type of nested series is <class 'pandas.core.series.Series'>
The series converted to list is  [100.0, 95.0, 95.0, 98.0]
The type of list is  <class 'list'>
