I have a scatter plot of Y against X, and some X values have multiple Y values. I want to turn the result into an array.
Basically, I want to find the mean of each column (the mean of all Y values for a given X value) and plot it. Here's the data for the X and Y vectors that I have:

Here's my code:
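import pandas as pd
import matplotlib.pyplot as plt

# Load the CSV and keep only its second and third columns
dataset = pd.read_csv('csv_file.csv')
dataset = dataset.iloc[:, 1:3]

# Sort the rows by the second of the two kept columns
dataset = dataset.sort_values(by=dataset.columns[1])

# Second kept column as an (n, 1) array
X = dataset.iloc[:, 1].values
X = X.reshape(len(X), 1)

# First kept column as an (n, 1) array
y = dataset.iloc[:, 0].values
y = y.reshape(len(y), 1)

# Scatter plot: first kept column on the horizontal axis, second on the vertical axis
plt.scatter(y, X, color='pink', label='data')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Data plotting')
plt.legend()
plt.show()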
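
To make the goal concrete, here is a minimal sketch of the kind of result I'm after. It uses made-up numbers, not my real data. The column names `x` and `y` and the helper `mean_y_per_x` are hypothetical, and grouping with pandas' `groupby` is only one possible approach.

```python
import pandas as pd


def mean_y_per_x(df, x_col, y_col):
    """Return (unique x values in sorted order, mean y for each x) as NumPy arrays."""
    # groupby sorts the group keys by default, so the index holds the sorted unique x values
    means = df.groupby(x_col)[y_col].mean()
    return means.index.to_numpy(), means.to_numpy()


# Made-up data for illustration only: x = 1 and x = 3 each have several y values
demo = pd.DataFrame({
    "x": [1, 1, 2, 3, 3, 3],
    "y": [2.0, 4.0, 5.0, 1.0, 2.0, 6.0],
})

xs, ys = mean_y_per_x(demo, "x", "y")
print(xs)  # [1 2 3]
print(ys)  # [3. 5. 3.]
```

The two arrays could then be passed to `plt.plot(xs, ys)` to draw the mean on top of the scatter plot.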