ValueError: array length 13996092 does not match index length 214200 for Kaggle submission


My model is based on the model linked here on Kaggle.

Specifically, I use the following lines of code to generate the table that I upload as my submission:

submission_pfs = my_model.predict(X_test)
# we will keep every value between 0 and 20
submission_pfs = submission_pfs.clip(0,20)
# creating dataframe with required columns 
submission = pd.DataFrame({'ID':test_data['ID'],'item_cnt_month':submission_pfs.ravel()})
# creating csv file from dataframe
submission.to_csv('sub_pfs.csv',index = False)

But I get the following error and I am not sure why:

ValueError: array length 13996092 does not match index length 214200

X_test is all of my test data (a time series of values). It is a NumPy array of shape (424124, 33, 1).

test_data is the test dataset: just 3 columns holding the IDs (product ID, shop ID and entry ID), with a shape of 214200 rows × 3 columns.

The shape of submission_pfs is (424124, 33, 1) as well.

The 214200 comes from test_data, while 13996092 comes from the multiplication 424124*33.
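The mismatch can be reproduced at a smaller scale. The sketch below is only an illustration under stated assumptions: `preds` and `ids` are hypothetical stand-ins for submission_pfs and test_data, with the shapes shrunk from (424124, 33, 1) and 214200 rows to (6, 3, 1) and 4 rows. It shows that ravel() flattens every element of the array, so the flattened length is the product of all dimensions rather than the number of rows in the ID column.

```python
import numpy as np
import pandas as pd

# Hypothetical scaled-down stand-ins (not the real data):
# predictions shaped like (424124, 33, 1) -> (6, 3, 1)
# ID table with 214200 rows               -> 4 rows
preds = np.zeros((6, 3, 1))
ids = pd.DataFrame({"ID": range(4)})

# ravel() flattens all dimensions, so the length is 6 * 3 * 1
flat = preds.ravel()
print(preds.shape, flat.shape, len(ids))

caught = None
try:
    # A Series of 4 IDs next to a flat array of 18 values cannot be aligned
    pd.DataFrame({"ID": ids["ID"], "item_cnt_month": flat})
except ValueError as err:
    caught = err
    print(type(err).__name__, err)
```

In this sketch the flattened array is longer than the ID column for the same reason as in the real data: one value per element of the prediction array, not one value per ID.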

I tried:

submission = pd.DataFrame({'ID':X_test['ID'],'item_cnt_month':submission_pfs.ravel()})

But that gives:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices


I also tried:

submission = pd.DataFrame({'ID':X_test,'item_cnt_month':submission_pfs.ravel()})

which gives:

Exception: Data must be 1-dimensional
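Both failed attempts can also be reproduced on a small array. This is a minimal sketch, assuming `X` is a hypothetical stand-in for X_test with the shape shrunk to (6, 3, 1). It suggests that the first error arises because a NumPy array has no column labels to look up, and the second because a DataFrame column needs 1-dimensional data. The exact exception type and message for the second case appear to vary between pandas versions, so the sketch catches a generic Exception.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X_test (real shape: (424124, 33, 1))
X = np.zeros((6, 3, 1))

errors = {}

try:
    # A NumPy array cannot be indexed with a string label such as 'ID'
    X["ID"]
except IndexError as err:
    errors["label_lookup"] = type(err).__name__

try:
    # A 3-D array cannot serve as a single DataFrame column
    pd.DataFrame({"ID": X, "item_cnt_month": X.ravel()})
except Exception as err:  # type and message depend on the pandas version
    errors["3d_column"] = type(err).__name__

print(errors)
```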

I am not sure what I am doing wrong or how I can fix this.

Mr. Johnny Doe

Posted 2021-01-19T07:43:26.530

No answers