optimization - Using optimize() in R for the first time - am I doing this right? - Stack Overflow

IT技术

更新时间：2025-01-088

admin管理员组
文章数量:1122846

I'm outside my domain expertise and even though I've created an 'answer' (with lots of googling and ChatGPT) I need someone to let me know if I'm doing what I intended to do.

I'm following a tutorial on Creating an Expected Points Model Inspired by Pythagorean Expectation. He uses a solver from Excel to find the optimal exponent coefficient c by looking for the lowest mse. It is my understanding that optimize() does a similar job.

I have come up with the following function but don't understand how the values are getting passed to optimize() and why it looks like the function produces three values predicted_xPts_perc, se, and mean(se). Shouldn't it provide two?

# Objective function
objective_function <- function(c) {
  predicted_xPts_perc <- data$xg_for_per_match^c / (data$xg_for_per_match^c + data$xg_against_per_match^c)
  se <- (predicted_xPts_perc - data$actual_points_perc)^2
  mean(se)
}

# Optimize the objective
result <- optimize(f = objective_function, interval = c(0, 5), lower = 1, upper = 3)
optimal_c <- result$minimum
print(optimal_c)

Here is the data (sample) and full code block:

data <- structure(list(xg_for_per_match = c(2.431, 2.256, 2.399, 1.86777777777778, 
                                    2.56, 1.776, 1.59222222222222, 1.665, 0.907, 1.691, 1.52555555555556, 
                                    0.56, 1.46555555555556, 0.754, 1.955, 1.285, 1.38428571428571, 
                                    1.282, 1.86625, 1.5075), xg_against_per_match = c(0.757, 0.682, 
                                                                                      0.739, 1.40555555555556, 0.849, 1.409, 1.17777777777778, 1.822, 
                                                                                      1.938, 2.489, 1.70666666666667, 3.136, 2.37555555555556, 2.985, 
                                                                                      0.78375, 1.56, 1.43, 0.812, 0.945, 0.94875), actual_points_perc = c(0.866666666666667, 
                                                                                                                                                          0.766666666666667, 0.733333333333333, 0.666666666666667, 0.6, 
                                                                                                                                                          0.6, 0.592592592592593, 0.566666666666667, 0.4, 0.4, 0.37037037037037, 
                                                                                                                                                          0.2, 0.111111111111111, 0, 0.791666666666667, 0.625, 0.619047619047619, 
                                                                                                                                                          0.6, 0.541666666666667, 0.416666666666667), se = c(0.0020194397996264, 
                                                                                                                                                                                                             0.0223794165299103, 0.0323996328399904, 0.000796304748754376, 
                                                                                                                                                                                                             0.0905484326863811, 0.00018817673840017, 0.00288909832711458, 
                                                                                                                                                                                                             0.0124545498241967, 0.0485423096732081, 0.0070889390780061, 0.0054423086667482, 
                                                                                                                                                                                                             0.0285940157162698, 0.027082829359042, 0.00359736139900119, 0.00488179512531288, 
                                                                                                                                                                                                             0.048737642866935, 0.0183025693952632, 0.0129244422758347, 0.0646460987510766, 
                                                                                                                                                                                                             0.0897732500472823)), row.names = c(NA, -20L), class = c("tbl_df", 
                                                                                                                                                                                                                                                                      "tbl", "data.frame"))

# Objective function
objective_function <- function(c) {
  predicted_xPts_perc <- data$xg_for_per_match^c / (data$xg_for_per_match^c + data$xg_against_per_match^c)
  se <- (predicted_xPts_perc - data$actual_points_perc)^2
  mean(se)
}

# Optimize the objective
result <- optimize(f = objective_function, interval = c(0, 5), lower = 1, upper = 3)
optimal_c <- result$minimum
print(optimal_c)

I'm outside my domain expertise and even though I've created an 'answer' (with lots of googling and ChatGPT) I need someone to let me know if I'm doing what I intended to do.

I'm following a tutorial on Creating an Expected Points Model Inspired by Pythagorean Expectation. He uses a solver from Excel to find the optimal exponent coefficient c by looking for the lowest mse. It is my understanding that optimize() does a similar job.

I have come up with the following function but don't understand how the values are getting passed to optimize() and why it looks like the function produces three values predicted_xPts_perc, se, and mean(se). Shouldn't it provide two?

# Objective function
objective_function <- function(c) {
  predicted_xPts_perc <- data$xg_for_per_match^c / (data$xg_for_per_match^c + data$xg_against_per_match^c)
  se <- (predicted_xPts_perc - data$actual_points_perc)^2
  mean(se)
}

# Optimize the objective
result <- optimize(f = objective_function, interval = c(0, 5), lower = 1, upper = 3)
optimal_c <- result$minimum
print(optimal_c)

Here is the data (sample) and full code block:

data <- structure(list(xg_for_per_match = c(2.431, 2.256, 2.399, 1.86777777777778, 
                                    2.56, 1.776, 1.59222222222222, 1.665, 0.907, 1.691, 1.52555555555556, 
                                    0.56, 1.46555555555556, 0.754, 1.955, 1.285, 1.38428571428571, 
                                    1.282, 1.86625, 1.5075), xg_against_per_match = c(0.757, 0.682, 
                                                                                      0.739, 1.40555555555556, 0.849, 1.409, 1.17777777777778, 1.822, 
                                                                                      1.938, 2.489, 1.70666666666667, 3.136, 2.37555555555556, 2.985, 
                                                                                      0.78375, 1.56, 1.43, 0.812, 0.945, 0.94875), actual_points_perc = c(0.866666666666667, 
                                                                                                                                                          0.766666666666667, 0.733333333333333, 0.666666666666667, 0.6, 
                                                                                                                                                          0.6, 0.592592592592593, 0.566666666666667, 0.4, 0.4, 0.37037037037037, 
                                                                                                                                                          0.2, 0.111111111111111, 0, 0.791666666666667, 0.625, 0.619047619047619, 
                                                                                                                                                          0.6, 0.541666666666667, 0.416666666666667), se = c(0.0020194397996264, 
                                                                                                                                                                                                             0.0223794165299103, 0.0323996328399904, 0.000796304748754376, 
                                                                                                                                                                                                             0.0905484326863811, 0.00018817673840017, 0.00288909832711458, 
                                                                                                                                                                                                             0.0124545498241967, 0.0485423096732081, 0.0070889390780061, 0.0054423086667482, 
                                                                                                                                                                                                             0.0285940157162698, 0.027082829359042, 0.00359736139900119, 0.00488179512531288, 
                                                                                                                                                                                                             0.048737642866935, 0.0183025693952632, 0.0129244422758347, 0.0646460987510766, 
                                                                                                                                                                                                             0.0897732500472823)), row.names = c(NA, -20L), class = c("tbl_df", 
                                                                                                                                                                                                                                                                      "tbl", "data.frame"))

# Objective function
objective_function <- function(c) {
  predicted_xPts_perc <- data$xg_for_per_match^c / (data$xg_for_per_match^c + data$xg_against_per_match^c)
  se <- (predicted_xPts_perc - data$actual_points_perc)^2
  mean(se)
}

# Optimize the objective
result <- optimize(f = objective_function, interval = c(0, 5), lower = 1, upper = 3)
optimal_c <- result$minimum
print(optimal_c)

Share Improve this question asked Nov 22, 2024 at 1:50 seansteele 6813 silver badges14 bronze badges

1 everything you did is correct – Onyambu Commented Nov 22, 2024 at 2:23

Add a comment |

1 Answer 1

Sorted by: Reset to default 1

That does work, but R has the nls() function for performing non-linear least square fits.

model <- nls(actual_points_perc ~ xg_for_per_match^c / (xg_for_per_match^c + xg_against_per_match^c), data=data, start=list(c=1))

summary(model)
# Formula: actual_points_perc ~ xg_for_per_match^c/(xg_for_per_match^c + 
#                                                      xg_against_per_match^c)
# 
# Parameters:
#    Estimate Std. Error t value Pr(>|t|)    
# c   1.0249     0.1935   5.297 4.11e-05 ***
#    ---
#    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.1242 on 19 degrees of freedom
# 
# Number of iterations to convergence: 2 
# Achieved convergence tolerance: 9.647e-06

本文标签： optimizationUsing optimize() in R for the first timeam I doing this rightStack Overflow

版权声明：本文标题：optimization - Using optimize() in R for the first time - am I doing this right? - Stack Overflow 内容由网友自发贡献，该文观点仅代表作者本人，转载请联系作者并注明出处：http://www.betaflare.com/web/1736306157a1932853.html，本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容，一经查实，本站将立刻删除。

编程频道|软件玩家 - 软件改变生活！

optimization - Using optimize() in R for the first time - am I doing this right? - Stack Overflow

1 Answer 1

更多相关文章