I'm new to Python and have a few questions about a Keras Sequential model: I get different means depending on how I calculate them.

The relevant code is pasted below, followed by my questions.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split

def regression_model():
    model = Sequential() # build an instance of a model
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1)) # Output layer with 1 node
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
    return model

X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.3)
model = regression_model() # call the function to create a model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50)

scores = model.evaluate(X_test, y_test, verbose=0)
_y_test = model.predict(X_test)

from sklearn.metrics import mean_squared_error
mean_square_error = mean_squared_error(y_test.values, _y_test)

print("Mean Squared error by formula: ", mean_square_error)

mean1 = np.mean(mean_square_error)
print("Mean of mean Squared Error by scikit: ", mean1)

mean2=np.mean((y_test.values - _y_test) ** 2)
print("Mean of mean Squared Error by formula: ", mean2)

standard_deviation = np.std(mean_square_error)

Key variable outputs:

mean: 882.13

mean_square_error: 119.21

mean1: 119.21

mean2: 516.47

standard_deviation: 0

Questions:

  1. Why are mean1 and mean2 different?
  2. Why is "mean" different from mean1 or mean2?
  3. Why is the standard deviation 0?
  4. While fitting data with the model, why is accuracy always listed as 0.0000e+00? The loss values across iterations look fine.

I was expecting mean1 and mean2 to be the same, and the standard deviation to be non-zero.

Asked Jan 19 at 16:24 by nic dunais; edited Jan 19 at 22:51 by pomoworko.com.

1 Answer

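  1. mean_square_error is a single number, so mean1 takes the mean of that number, which just returns the number. The difference from mean2 is most likely a shape mismatch rather than a missing division, since np.mean also divides by the number of elements. If y_test is a pandas Series, y_test.values has shape (n,), while model.predict returns an array of shape (n, 1). Subtracting the two broadcasts to an (n, n) matrix, so mean2 averages the squared difference between every target and every prediction. sklearn's mean_squared_error aligns the two arrays before comparing them. Flattening the predictions first, for example with _y_test.ravel(), should make mean2 match mean1. See the scikit-learn documentation for mean_squared_error.

  2. Where is mean evaluated? The code shown does not define it. In any case, a mean squared error is a different quantity from a mean.

  3. Again, as mean_square_error is a single number, the standard deviation of it is zero. To get a spread, take the standard deviation of the per-sample squared errors instead.

  4. accuracy rounds predictions off to integers and checks them for exact equality with the targets, which almost never happens with continuous values, so it is not an appropriate metric for regression. A continuous metric, such as MSE, would be a better choice.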
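
Points 1 and 3 can be checked with plain NumPy. The following is only a sketch with made-up numbers, not the asker's data. It assumes the targets are 1-D with shape (n,), as y_test.values is for a pandas Series, and that the predictions have shape (n, 1), as model.predict returns for a Dense(1) output layer. The names y_true, y_pred and the other variables are illustrative.

```python
import numpy as np

# Made-up stand-ins for y_test.values (1-D) and model.predict(X_test) (2-D).
y_true = np.array([1.0, 2.0, 3.0, 4.0])           # shape (4,)
y_pred = np.array([[1.5], [2.5], [2.5], [3.0]])   # shape (4, 1)

# (4,) minus (4, 1) broadcasts to (4, 4): every target against every prediction.
diff_broadcast = y_true - y_pred
mse_broadcast = np.mean(diff_broadcast ** 2)

# Flattening the predictions first gives one residual per sample.
squared_errors = (y_true - y_pred.ravel()) ** 2
mse = np.mean(squared_errors)

# The standard deviation of a single number is 0;
# the spread lives in the per-sample squared errors.
std_of_scalar = np.std(mse)
std_of_errors = np.std(squared_errors)

print("broadcast shape:", diff_broadcast.shape)
print("MSE with broadcasting:", mse_broadcast)
print("MSE with aligned shapes:", mse)
print("std of the scalar MSE:", std_of_scalar)
print("std of per-sample squared errors:", std_of_errors)
```

If the two MSE values differ in a check like this on the real arrays, comparing y_test.values.shape with _y_test.shape is a quick way to confirm whether broadcasting is the cause.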

Tags: python. Source: "Need help understanding different ways to calculate mean error in regression models", Stack Overflow.