Introduction to NumPy and pandas
What is NumPy?
• NumPy is a Python library used for working with
arrays.
• It also has functions for working in the domains of
linear algebra, Fourier transforms, and matrices.
• NumPy was created in 2005 by Travis Oliphant.
It is an open source project and you can use it
freely.
• NumPy stands for Numerical Python.
Why Use NumPy?
• In Python we have lists that serve the purpose of
arrays, but they are slow to process for large numeric data.
• NumPy aims to provide an array object that is up
to 50x faster than traditional Python lists. The actual
speedup depends on the operation and the array size.
• The array object in NumPy is called ndarray. It
provides many supporting functions that make
working with ndarray very easy.
• Arrays are very frequently used in data science,
where speed and resources are very important.
Why is NumPy Faster Than Lists?
• NumPy arrays are stored in one contiguous
block of memory, unlike lists, so processes can
access and manipulate them very efficiently.
• This behavior is called locality of reference in
computer science.
• This is the main reason why NumPy is faster
than lists. NumPy is also optimized to work with
modern CPU architectures.
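The memory layout described above can be inspected directly on an array. The following is a minimal sketch; the variable names are illustrative, and the byte counts assume the int64 dtype requested in the code.

```python
import numpy as np

# Six 64-bit integers stored back to back in one block of memory
arr = np.arange(6, dtype=np.int64)

is_contiguous = arr.flags['C_CONTIGUOUS']  # True for a freshly created array
bytes_per_item = arr.itemsize              # 8 bytes per int64 element
total_bytes = arr.nbytes                   # 6 elements * 8 bytes
step_in_bytes = arr.strides                # distance between neighbouring elements

# A strided view skips elements, so it is no longer one contiguous block
every_other = arr[::2]
view_is_contiguous = every_other.flags['C_CONTIGUOUS']

print(is_contiguous, bytes_per_item, total_bytes, step_in_bytes)
print(view_is_contiguous)
```

A Python list, by contrast, stores references to separately allocated objects, so its elements are not guaranteed to sit next to each other in memory.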
Vectorization in NumPy with Practical Examples
• Vectorization in NumPy is a method
of performing operations on entire arrays
without explicit loops. This approach leverages
NumPy’s underlying C implementation for
faster and more efficient computations.
• By replacing iterative processes with vectorized
functions, you can significantly optimize
performance in data analysis, machine
learning, and scientific computing tasks.
import numpy as np
a1 = np.array([2, 4, 6, 8, 10])
number = 2
result = a1 + number
print(result)
Output
[ 4  6  8 10 12]
Vectorization is significant because it:
• Improves Performance: Operations are faster
due to pre-compiled C-based
implementations.
• Simplifies Code: Eliminates explicit loops,
making code cleaner and easier to read.
• Supports Scalability: Efficiently handles large
datasets.
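To make the comparison concrete, the sketch below computes the same squares once with an explicit loop and once with a vectorized expression. The array contents and variable names are illustrative only.

```python
import numpy as np

values = np.array([1.5, 2.0, 3.5, 4.0])

# Explicit loop: one Python-level operation per element
squared_loop = np.empty_like(values)
for i in range(len(values)):
    squared_loop[i] = values[i] ** 2

# Vectorized: a single expression applied to the whole array
squared_vec = values ** 2

same_result = np.array_equal(squared_loop, squared_vec)
print(squared_vec)
print(same_result)
```

Both versions give the same result; the vectorized one is shorter and leaves the looping to NumPy's compiled code.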
Adding two arrays together with vectorization
import numpy as np
a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])
result = a1 + a2
print(result)
Output
[5 7 9]
Element-Wise Multiplication with array
import numpy as np
a1 = np.array([1, 2, 3, 4])
result = a1 * 2
print(result)
Output
[2 4 6 8]
Logical Operations on Arrays
import numpy as np
a1 = np.array([10, 20, 30])
result = a1 > 15
print(result)
Output
[False  True  True]
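A common use of such a Boolean result is as a mask that selects matching elements. The sketch below reuses the same array; the names mask, selected and count_above are illustrative.

```python
import numpy as np

a1 = np.array([10, 20, 30])

# The comparison produces one Boolean per element
mask = a1 > 15

# Indexing with the mask keeps only the elements where it is True
selected = a1[mask]

# True counts as 1, so summing the mask counts the matches
count_above = int(mask.sum())

print(selected)
print(count_above)
```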
Matrix Operations Using Vectorization
import numpy as np
a1 = np.array([[1, 2], [3, 4]])
a2 = np.array([[5, 6], [7, 8]])
result = np.dot(a1, a2)
print(result)
Output
[[19 22]
 [43 50]]
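It is easy to confuse the matrix product with element-wise multiplication. The sketch below, using the same two matrices, shows that the @ operator gives the same result as np.dot for 2-D arrays, while * multiplies matching elements.

```python
import numpy as np

a1 = np.array([[1, 2], [3, 4]])
a2 = np.array([[5, 6], [7, 8]])

# Matrix product: rows of a1 combined with columns of a2
matrix_product = a1 @ a2

# Element-wise product: each element multiplied by its counterpart
elementwise_product = a1 * a2

print(matrix_product)
print(elementwise_product)
```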
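Applying Custom Functions with NumPy vectorize() Function
import numpy as np
def custom_func(x):
    return x**2 + 2*x + 1
a1 = np.array([1, 2, 3, 4])
# np.vectorize wraps the function so that it is applied element by element
vec_func = np.vectorize(custom_func)
result = vec_func(a1)
print(result)
Output
[ 4  9 16 25]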
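np.vectorize is mainly useful when a function only works on one scalar at a time, for example because it contains an if statement. The sketch below uses a hypothetical keep_positive function to illustrate this, and compares it with np.where, which gives the same result without a Python-level call per element.

```python
import numpy as np

def keep_positive(x):
    # Works on a single number only: "if" cannot test a whole array at once
    return x if x > 0 else 0

a1 = np.array([-2, -1, 0, 1, 2])

# otypes fixes the output dtype instead of inferring it from the first result
vec_keep_positive = np.vectorize(keep_positive, otypes=[int])
result_vectorize = vec_keep_positive(a1)

# Equivalent built-in alternative
result_where = np.where(a1 > 0, a1, 0)

print(result_vectorize)
print(result_where)
```

Note that np.vectorize is a convenience wrapper: it still calls the Python function once per element, so it does not bring the speed of true vectorized operations.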
PANDAS
• Pandas is one of the most used libraries in
Python for data science or data analysis. It can
read data from CSV or Excel files, manipulate
the data, and generate insights from it. Pandas
can also be used to clean data, filter data, and
visualize data.
List of Important Pandas Functions
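• Pandas read_csv() Function: This function reads a CSV file into a
DataFrame.
• Pandas head() Function: This method returns the first rows of the
DataFrame (5 by default).
• Pandas tail() Function: This method returns the last rows of the
DataFrame (5 by default).
• Pandas sample() Function: This method returns a random sample of
rows (or of columns, with axis=1) from the DataFrame.
• Pandas info() Function: This method prints a summary of the
DataFrame, including the column names, their data types, and the
number of non-null values in each column.
• Pandas dtypes Attribute: dtypes is an attribute rather than a function,
so it is used without parentheses. It returns a Series
with the data type of each column.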
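The functions listed above can be tried together on a small table. The sketch below is only an illustration: the CSV text and its column names are made up, and the data is read from an in-memory string so that no file on disk is needed.

```python
import io
import pandas as pd

# Hypothetical CSV content (not from the slides)
csv_text = (
    "name,age,city\n"
    "Asha,31,Chennai\n"
    "Ravi,27,Madurai\n"
    "Meena,45,Salem\n"
    "Kumar,38,Erode\n"
)

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

first_two = df.head(2)                       # first 2 rows
last_two = df.tail(2)                        # last 2 rows
one_random = df.sample(n=1, random_state=0)  # 1 random row, reproducible
column_types = df.dtypes                     # attribute: no parentheses

df.info()                                    # prints the summary itself
print(first_two)
print(last_two)
print(one_random)
print(column_types)
```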
Python Pandas Series
• Pandas Series is a one-dimensional labeled
array capable of holding data of any type
(integer, string, float, python objects, etc.).
import pandas as pd
# simple list
data = [1, 2, 3, 4]
ser = pd.Series(data)
print(ser)
Output
0    1
1    2
2    3
3    4
dtype: int64
Creating a Pandas Series
import pandas as pd
import numpy as np
# simple array
data = np.array(['g', 'e', 'e', 'k', 's'])
ser = pd.Series(data)
print(ser)
Output
0    g
1    e
2    e
3    k
4    s
dtype: object
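The examples above use the default labels 0, 1, 2, and so on. Since a Series is a labeled array, the labels can also be chosen explicitly. The sketch below uses made-up subject names and marks as an illustration.

```python
import pandas as pd

# Hypothetical data: marks labeled by subject
marks = pd.Series([82, 91, 76], index=["maths", "physics", "chemistry"])

# Access a value by its label instead of its position
physics_mark = marks["physics"]

# Comparisons are vectorized, so they can be used to filter the Series
above_80 = marks[marks > 80]

print(physics_mark)
print(above_80)
```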
Pandas Data Frames
• A pandas DataFrame is a two-dimensional labeled
data structure with rows and columns.
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Output
   calories  duration
0       420        50
1       380        40
2       390        45
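Once the DataFrame exists, individual columns and rows can be selected from it. The sketch below rebuilds the same table and shows a few common selections; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

# A single column is returned as a Series
calories = df["calories"]

# loc selects a row by its index label
first_row = df.loc[0]

# A Boolean condition keeps only the matching rows
long_sessions = df[df["duration"] > 40]

# Column-wise statistics are vectorized
mean_calories = df["calories"].mean()

print(calories)
print(first_row)
print(long_sessions)
print(mean_calories)
```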
Indexing And Slicing
• Indexing is the process of accessing an
element in a sequence using its position in the
sequence (its index).
• In Python, indexing starts from 0, which
means the first element in a sequence is at
position 0, the second element is at position 1,
and so on.
• my_list = ['apple', 'banana', 'cherry', 'date']
print(my_list[0]) # output: 'apple'
print(my_list[1]) # output: 'banana'
• Slicing in Python
• Slicing is the process of accessing a subsequence
of a sequence by specifying a
starting and ending index. In Python, you
perform slicing using the colon : operator. The element
at start_index is included; the element at end_index is not.
• sequence[start_index:end_index]
• my_list = ['apple', 'banana', 'cherry', 'date']
print(my_list[1:3]) # output: ['banana', 'cherry']
• my_list = ['apple', 'banana', 'cherry', 'date']
print(my_list[:2]) # output: ['apple', 'banana']
print(my_list[2:]) # output: ['cherry', 'date']
• In the first line of the above code, we have
used slicing to get all the elements from the
beginning of my_list up to (but not including)
the element at index 2. In the second line, we
have used slicing to get all the elements from
index 2 to the end of my_list.
• numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
odd_numbers = numbers[::2]
print(odd_numbers) # output: [1, 3, 5, 7, 9]
• The ::2 slice selects every other element, starting
from the first element (index 0). In this list, those
elements happen to be the odd numbers.
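Slices also accept negative indices and a negative step. The sketch below extends the my_list example with a few such cases; the variable names are illustrative.

```python
my_list = ['apple', 'banana', 'cherry', 'date']

# Negative indices count from the end of the sequence
last_item = my_list[-1]
last_two = my_list[-2:]

# start:stop:step, here every second element starting at index 1
every_second_from_one = my_list[1::2]

# A step of -1 walks the sequence backwards and returns a reversed copy
reversed_copy = my_list[::-1]

print(last_item)
print(last_two)
print(every_second_from_one)
print(reversed_copy)
```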
Concatenation
• Concatenation is done with the + operator.
It is supported by sequence data
types (string, list, tuple). Both operands must be
of the same data type.
• s1 = "Welcome"
s2 = "to"
s3 = "python"
s4 = s1 + s2 + s3
print(s4) # Output: Welcometopython
• s = 1 + 2
print(s) # Output: 3
print(type(s)) # Output: <class 'int'>
• s1 = '1' + '2'
print(s1) # Output: 12
print(type(s1)) # Output: <class 'str'>
• t1 = (1, 2)
t2 = (3, 4)
print(t1 + t2) # Output: (1, 2, 3, 4)
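The rule that both operands must have the same type can be checked directly. The sketch below shows the TypeError raised when a string and an integer are combined, and one way to avoid it by converting explicitly; the variable names are illustrative.

```python
count = 3

# Mixing str and int with + is not allowed
try:
    message = "Items: " + count
except TypeError as error:
    message = None
    print("TypeError:", error)

# Convert to the same type first
fixed_message = "Items: " + str(count)

# The same applies to list + tuple: convert one side before concatenating
combined = [1, 2] + list((3, 4))

print(fixed_message)
print(combined)
```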
Repetition
• Sequence data types (both mutable and
immutable) support the repetition operator *. The
repetition operator * repeats the contents of the
sequence the given number of times and combines
them into a new sequence. When * is used between two
numbers it performs multiplication, but between a list, tuple,
or string and an integer it performs repetition.
• s1 = "python"
print(s1 * 3) # Output: pythonpythonpython
• l1 = [1, 2, 3]
print(l1 * 3) # Output: [1, 2, 3, 1, 2, 3, 1, 2, 3]
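Repetition is often used to create a sequence of a fixed size. The sketch below shows that, along with one caveat worth knowing: repeating a list that contains another list repeats references to the same inner list. The variable names are illustrative.

```python
# Typical uses: pre-filled list, repeated tuple, separator line
zeros = [0] * 5
pair_twice = (1, 2) * 2
dashes = "-" * 10

# Caveat: both rows below are the same inner list object
shared_rows = [[0] * 3] * 2
shared_rows[0][0] = 9        # the change shows up in both rows

# A list comprehension creates a separate inner list for each row
independent_rows = [[0] * 3 for _ in range(2)]
independent_rows[0][0] = 9   # only the first row changes

print(zeros, pair_twice, dashes)
print(shared_rows)
print(independent_rows)
```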
Add/Remove Set Items
• Add items to set:
• If you want to add a single item to the set use
the add() method.
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}
cities.add("Helsinki")
print(cities)
Output:
{'Tokyo', 'Helsinki', 'Madrid', 'Berlin', 'Delhi'}
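• If you want to add all the items of another set (or any
other iterable), use the update() method.
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}
cities2 = {"Helsinki", "Warsaw", "Seoul"}
cities.update(cities2)
print(cities)
Output (sets are unordered, so the printed order may differ):
{'Seoul', 'Berlin', 'Delhi', 'Tokyo', 'Warsaw', 'Helsinki', 'Madrid'}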
Remove items from set:
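• We can use the remove() and discard() methods
to remove items from a set. remove() raises a KeyError
if the item is not in the set, while discard() does
nothing in that case.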
Example 1:
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}
cities.remove("Tokyo")
print(cities)
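The difference between the two methods only shows up when the item is missing. The sketch below tries to remove "Paris", which is deliberately not in the set; the flag name remove_failed is illustrative.

```python
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}

# discard() silently ignores a missing item
cities.discard("Paris")

# remove() raises KeyError for a missing item
try:
    cities.remove("Paris")
    remove_failed = False
except KeyError:
    remove_failed = True

# Both methods behave the same when the item exists
cities.remove("Tokyo")

print(remove_failed)
print(cities)
```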
pop():
This method removes and returns an arbitrary item of the set. Because sets are
unordered, we don't know which item gets popped. However, you
can access the popped item if you assign the return value of pop() to a variable.
Example:
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}
item = cities.pop()
print(cities)
print(item)
Output (one possible result; the popped item may differ):
{'Tokyo', 'Delhi', 'Berlin'}
Madrid
del:
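del is not a method; it is a keyword which
deletes the set entirely. After del, the name no longer
exists, so using it again raises a NameError.
Example:
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}
del cities
print(cities) # raises NameError: name 'cities' is not defined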
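The effect of del can be observed safely by catching the resulting error. This is a minimal sketch; the flag name still_defined is illustrative.

```python
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}

# del removes the name itself, not just the items
del cities

try:
    print(cities)
    still_defined = True
except NameError as error:
    still_defined = False
    print("NameError:", error)

print(still_defined)
```

To keep the variable and only empty the set, clear() is the appropriate choice.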
clear():
This method removes all items from the set, leaving
an empty set.
Example:
cities = {"Tokyo", "Madrid", "Berlin", "Delhi"}
cities.clear()
print(cities)
Output:
set()
sorted() Function
This is a built-in function in Python which returns a sorted
list of the items of any iterable.
Syntax: sorted(iterable, key=None, reverse=False)
This function takes 3 parameters, out of which
2 (key and reverse) are optional keyword arguments, and the first
parameter, iterable, can be any iterable object whose
items can be compared with each other.
This function returns a new sorted list and does not
change the original data structure.
# List
list_of_items = ['g', 'e', 'e', 'k', 's']
print(sorted(list_of_items)) # ['e', 'e', 'g', 'k', 's']
# Tuple
tuple_of_items = ('g', 'e', 'e', 'k', 's')
print(sorted(tuple_of_items)) # ['e', 'e', 'g', 'k', 's']
# String (characters are sorted by their Unicode code points)
string = "geeks"
print(sorted(string)) # ['e', 'e', 'g', 'k', 's']
# Dictionary (only the keys are sorted)
dictionary = {'g': 1, 'e': 2, 'k': 3, 's': 4}
print(sorted(dictionary)) # ['e', 'g', 'k', 's']
# Set (duplicates are already removed)
set_of_values = {'g', 'e', 'e', 'k', 's'}
print(sorted(set_of_values)) # ['e', 'g', 'k', 's']
# Frozen Set
frozen_set = frozenset(('g', 'e', 'e', 'k', 's'))
print(sorted(frozen_set)) # ['e', 'g', 'k', 's']
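The examples above use only the first parameter. The sketch below illustrates the two optional parameters, key and reverse, on a made-up list of words.

```python
words = ["banana", "kiwi", "apple", "fig"]

# key: a function applied to each item to get the value used for comparison
by_length = sorted(words, key=len)

# reverse=True sorts in descending order
descending = sorted(words, reverse=True)

# Both parameters can be combined
longest_first = sorted(words, key=len, reverse=True)

print(by_length)
print(descending)
print(longest_first)
print(words)  # the original list is unchanged
```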
• The reverse() method reverses the elements
of the list in place: it modifies the original
list instead of creating a new one, which makes it
memory-efficient.
a = [1, 2, 3, 4, 5]
# Reverse the list in-place
a.reverse()
print(a) # Output: [5, 4, 3, 2, 1]
Using the reversed() Function
Python's built-in reversed() function is another
way to reverse the list. However, reversed() returns an
iterator and leaves the original list unchanged, so the
result needs to be converted back into a list.
a = [1, 2, 3, 4, 5]
# Use reversed() to create an iterator
# and convert it back to a list
rev = list(reversed(a))
print(rev)
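The practical difference between the approaches is whether the original list is modified. The sketch below compares reversed(), slicing with a step of -1, and reverse(); the variable names are illustrative.

```python
a = [1, 2, 3, 4, 5]

# reversed() and slicing both build a new list and leave "a" untouched
rev_from_iterator = list(reversed(a))
rev_from_slice = a[::-1]
original_after_copying = list(a)

# reverse() changes the list itself and returns None
b = [1, 2, 3]
returned_value = b.reverse()

print(rev_from_iterator, rev_from_slice)
print(original_after_copying)
print(b, returned_value)
```

Because reverse() returns None, writing b = b.reverse() would lose the list; call the method on its own line instead.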
unit 3 Python Packages by N.KARTHIKEYAN.pptx