What is wrong with the below code?

1

I have been working on a project that I took from Kaggle, but I don't get the results shown on the website. What am I doing wrong here?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

train = pd.read_csv('../data/dont-overfit/train.csv')
test = pd.read_csv('../data/dont-overfit/test.csv')

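# Histogram of the per-column standard deviations (feature columns start at index 2)
train[train.columns[2:]].std().plot(kind='hist')

# Histogram of the per-column means, drawn on the same axes
train[train.columns[2:]].mean().plot(kind='hist')
plt.title('Distribution of means/stds of all columns')

plt.show()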

print(train.isnull().any().any())

print('Distribution of first 28 columns')
plt.figure(figsize=(26,24))
for i, col in enumerate(list(train)[2:30]):
    plt.subplot(7, 4, i+1)
    plt.hist(train[col])
    plt.title(col)

plt.show()

Here is the post

Alexpandiyan Chokkan

Posted 2019-08-01T06:05:12.567

Reputation: 111

Question was closed 2019-08-01T15:02:35.200

The link you provided is not working. – Ankit Seth – 2019-08-01T06:16:34.113

@AnkitSeth updated the link – Alexpandiyan Chokkan – 2019-08-01T06:18:50.137

You are trying to plot the distribution of the first 28 columns, and it doesn't match the result shown on the website. The code looks okay to me. Has the dataset you downloaded changed? – SUN – 2019-08-01T07:30:41.843

Answers

1

The code seems OK for plotting the distributions. Let me know the exact error.
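If the plots differ from the ones on the website, one way to narrow it down is to look at the numbers behind the histograms instead of the figures. The snippet below is only a rough sketch: `summarize_features` is a hypothetical helper, and the synthetic frame assumes a layout of an `id` column, a `target` column, and then the feature columns, which is inferred from the `columns[2:]` slice in the question and not confirmed against the actual CSV.

```python
import numpy as np
import pandas as pd


def summarize_features(df, first_feature=2):
    """Collect the values that the question's histograms are built from.

    first_feature is the index of the first feature column; 2 is an
    assumption taken from the columns[2:] slice in the question.
    """
    features = df[df.columns[first_feature:]]
    return {
        "shape": df.shape,
        "has_nulls": bool(df.isnull().any().any()),
        "means": features.mean(),
        "stds": features.std(),
    }


# Synthetic stand-in for train.csv (hypothetical layout: id, target, features).
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(size=(250, 5)),
    columns=[str(i) for i in range(5)],
)
demo.insert(0, "target", rng.integers(0, 2, size=250))
demo.insert(0, "id", np.arange(250))

summary = summarize_features(demo)
print(summary["shape"])
print(summary["has_nulls"])
print(summary["means"].describe())
print(summary["stds"].describe())
```

Running `summarize_features(train)` on the downloaded file and comparing the shape and the mean/std summaries with what the website shows should indicate whether the difference comes from the data or from the plotting.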


P K

Posted 2019-08-01T06:05:12.567

Reputation: 36