ValueError: array length 13996092 does not match index length 214200 for Kaggle submission

My model is based on the model linked here on Kaggle.

Specifically, I use the following lines of code to generate the table that I upload as my submission:

submission_pfs = my_model.predict(X_test)
# clip every prediction to the range [0, 20]
submission_pfs = submission_pfs.clip(0, 20)
# create a dataframe with the required columns
submission = pd.DataFrame({'ID': test_data['ID'], 'item_cnt_month': submission_pfs.ravel()})
# write the dataframe to a csv file
submission.to_csv('sub_pfs.csv', index=False)

But I get the following error and I am not sure why:

ValueError: array length 13996092 does not match index length 214200

X_test is all my test data (a time series of values). It is a NumPy array of shape (424124, 33, 1).

test_data is the test dataset: just 3 columns with all the IDs (product ID, shop ID and entry ID). It has a shape of 214200 rows × 3 columns.

The shape of submission_pfs is (424124, 33, 1) as well.

The 214200 comes from the number of rows in test_data, while 13996092 comes from the multiplication 424124 * 33, which is the number of elements that submission_pfs.ravel() returns.
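The arithmetic above can be checked with a small sketch. The shapes and the names `preds` and `ids` below are hypothetical stand-ins for `submission_pfs` and `test_data['ID']`, not the real data. It only illustrates that `ravel()` flattens every axis, so the flattened length is the product of all dimensions rather than the number of samples:

```python
import numpy as np

# Hypothetical small shapes standing in for (424124, 33, 1) and 214200 rows.
n_samples, n_steps, n_ids = 6, 4, 3

preds = np.zeros((n_samples, n_steps, 1))  # stand-in for submission_pfs
ids = np.arange(n_ids)                     # stand-in for test_data['ID']

# ravel() flattens all three axes into one.
flat = preds.ravel()
print(flat.shape)               # (24,), i.e. n_samples * n_steps * 1
print(len(flat) == len(ids))    # False: the two columns have different lengths

# The numbers from the error message follow the same arithmetic.
assert 424124 * 33 * 1 == 13996092
```

Note that even before flattening, the first dimension (424124) already differs from the 214200 rows in `test_data`, so the number of samples fed to the model does not match the number of IDs either.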

I tried:

submission = pd.DataFrame({'ID':X_test['ID'],'item_cnt_month':submission_pfs.ravel()})

But that gives:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

While:

submission = pd.DataFrame({'ID':X_test,'item_cnt_month':submission_pfs.ravel()})

gives:

Exception: Data must be 1-dimensional
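As far as I can tell, both of these attempts fail for reasons unrelated to the length mismatch. A minimal sketch, using a hypothetical small array `X` in place of `X_test`: a plain NumPy array cannot be indexed with a column name such as `'ID'`, and a 3-dimensional array cannot serve as a single DataFrame column, which has to be 1-dimensional.

```python
import numpy as np

# Hypothetical stand-in for X_test: a plain 3-D NumPy array, not a DataFrame.
X = np.zeros((6, 4, 1))

# Indexing a plain ndarray with a string raises IndexError,
# because 'ID' is not an integer, slice, or boolean/integer array.
raised_index_error = False
try:
    X['ID']
except IndexError:
    raised_index_error = True

print(raised_index_error)  # True
print(X.ndim)              # 3, whereas a DataFrame column must be 1-dimensional
```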

I am not sure what I am doing wrong or how I can fix this.

Mr. Johnny Doe

Posted 2021-01-19T07:43:26.530

No answers