I'm trying to understand the difference between entropy and cross-entropy, as I often hear about the entropy of a language and the cross-entropy of a language model, and I want to understand the link between the two.
To simplify things, let's consider a language (with a vocabulary) and a language model trained on that language.
We'll work at the character level (which gives us 26 characters) and with a limited number of words (let's take the 20 names below).
prenoms = [
    "Alice", "Alfred", "Alina", "Aline", "Alexandre",
    "Alicia", "Alison", "Alma", "Alva", "Elise",
    "Elisa", "Eliane", "Alain", "Amélie", "Arline",
    "Olivier", "Oline", "Alva", "Eliott", "Julien"
]
How do we calculate the entropy over these 20 names (i.e., the entropy of our language) and the cross-entropy of our language model (let's take a unigram model, or any language model you prefer, to help me understand)?
If you have a more relevant example, I’m open to it.
PS: My confusion comes from the fact that, in the general definitions, we talk about a single (language) distribution P when calculating entropy (without quite knowing how to compute it), but about two distributions P and Q when calculating cross-entropy (and in the cross-entropy-loss case, P is a one-hot vector).
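To make the PS concrete for myself, here is a tiny sketch of how I currently picture the two formulas; the p_true / q_model values are made up, purely for illustration:

import math

def entropy(p):
    # H(P) = -sum_x P(x) * log2 P(x)
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

def cross_entropy(p, q):
    # H(P, Q) = -sum_x P(x) * log2 Q(x)
    return -sum(px * math.log2(q[x]) for x, px in p.items() if px > 0)

p_true = {"a": 0.5, "b": 0.25, "c": 0.25}   # toy "true" distribution P
q_model = {"a": 0.4, "b": 0.4, "c": 0.2}    # toy model distribution Q

print(entropy(p_true))                 # H(P) = 1.5 bits
print(cross_entropy(p_true, q_model))  # H(P, Q) >= H(P)

# The cross-entropy-loss case: P is one-hot, so the sum collapses
# to -log2 Q(correct symbol)
p_onehot = {"a": 1.0, "b": 0.0, "c": 0.0}
print(cross_entropy(p_onehot, q_model))  # == -math.log2(q_model["a"])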
PS2: Some Python code would help me understand; here is my full attempt, based on my reading of the Jurafsky book.
import math
from collections import Counter


def distribution_ngrams(text, n=4):
    """Return the empirical distribution of character n-grams of `text`."""
    ngrams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(ngrams)
    total = len(ngrams)
    # Relative frequency of each n-gram
    return {ngram: count / total for ngram, count in counts.items()}
def language_entropy_ngrams(text, n_approx=4):
    """
    Estimate the entropy of a text using n-grams (in theory we would take a
    very large n and an infinite sequence L).
    """
    distribution = distribution_ngrams(text, n_approx)
    # Shannon entropy of the n-gram distribution, in bits
    entropy = -sum(p * math.log2(p) for p in distribution.values())
    # Normalize by the n-gram length to get bits per character (entropy rate)
    return entropy / n_approx
def model_cross_entropy(text, n_approx=4):
    """
    Cross-entropy between the text's n-gram distribution (playing the role of P)
    and a unigram language model (Q), in bits per character.
    """
    unigram_model_distribution = distribution_ngrams(text, 1)
    # Approximate the "true" distribution P by the n-gram frequencies of the text
    language_model_distribution_approximation = distribution_ngrams(text, n_approx)
    cross_entropy = 0.0
    for ngram, p in language_model_distribution_approximation.items():
        # The unigram model scores an n-gram as the product of its character probabilities
        q = 1.0
        for c in ngram:
            q *= unigram_model_distribution[c]
        cross_entropy -= p * math.log2(q)
    return cross_entropy / n_approx
if __name__ == "__main__":
    # Each first name can be seen as a sequence of characters
    prenoms = ["Alice", "Alfred", "Alina", "Aline", "Alexandre", "Alicia",
               "Alison", "Alma", "Alva", "Elise", "Elisa", "Eliane", "Alain",
               "Amélie", "Arline", "Olivier", "Oline", "Alva", "Eliott", "Julien"]
    # The corpus/language L can be seen as the concatenation of those sequences
    L = ''.join(prenoms).lower()
    print(language_entropy_ngrams(L))
    print(model_cross_entropy(L))
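For comparison, my reading of the Jurafsky book is that in practice the cross-entropy of a model on a text is often estimated as the average negative log2-probability the model assigns to each symbol of the text. Here is a minimal self-contained sketch of that view for the character-level unigram model (the function name unigram_cross_entropy_per_char is mine, just for illustration):

import math
from collections import Counter

def unigram_cross_entropy_per_char(text):
    """Average -log2 q(c) over the characters of `text`, where q is the
    unigram (character-frequency) model estimated on that same text."""
    counts = Counter(text)
    total = len(text)
    q = {c: n / total for c, n in counts.items()}
    return -sum(math.log2(q[c]) for c in text) / len(text)

prenoms = ["Alice", "Alfred", "Alina", "Aline", "Alexandre", "Alicia",
           "Alison", "Alma", "Alva", "Elise", "Elisa", "Eliane", "Alain",
           "Amélie", "Arline", "Olivier", "Oline", "Alva", "Eliott", "Julien"]
L = ''.join(prenoms).lower()
print(unigram_cross_entropy_per_char(L))

If I'm not mistaken, since q is estimated on the same text, this average is exactly the entropy of the unigram distribution, and that is part of what I'd like to see disentangled: which quantity plays the role of P and which plays the role of Q.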