I'm working on a Python script that extracts emails, dates, and phone numbers from a text file without using regex and saves them to a table or CSV file. The extraction mostly works, but the extracted values don't line up correctly in the output rows.
Minimal Reproducible Example:
Input File (text):
Hello, you can reach out to us at [email protected] or [email protected].
Our customer service is available 24/7. Call us at (123) 456-7890 or 987-654-3210.
Important Dates:
- Application Deadline: 12-08-2024
- Event Date: 2024/11/20
Issue: The emails and dates are extracted correctly, but the phone numbers end up in the wrong rows. The first row gets the second phone number (987-654-3210) instead of the first, and the second row has a date but no phone number, even though both phone numbers appear in the text before the dates.
What I've Tried: I wrote functions that extract emails, dates, and phone numbers using basic string methods, and I tried aligning the results row by row, but some phone numbers still aren't assigned to the correct rows.
Code Snippet:
def extract_phone_numbers(text):
    phone_numbers = []
    for word in text.split():
        clean_word = word.strip(",.()")
        if clean_word.isdigit() and len(clean_word) in [10, 11]:
            phone_numbers.append(clean_word)
        elif "-" in clean_word:
            parts = clean_word.split("-")
            if all(part.isdigit() for part in parts) and len(clean_word.replace("-", "")) in [10, 11]:
                phone_numbers.append(clean_word)
        elif clean_word.startswith("(") and ")" in clean_word:
            phone_numbers.append(clean_word)
    return phone_numbers
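To see which tokens the function actually examines, it may help to trace one line of the input. The sketch below is only a diagnostic: `classify` is a hypothetical helper, not part of the original script, that repeats the same checks for a single whitespace-separated token and reports whether it would be accepted.

```python
# Sample line taken from the input file above.
line = "Call us at (123) 456-7890 or 987-654-3210."


def classify(word):
    """Hypothetical helper: apply the checks from extract_phone_numbers to one token."""
    clean_word = word.strip(",.()")
    if clean_word.isdigit() and len(clean_word) in (10, 11):
        return clean_word, True
    if "-" in clean_word:
        parts = clean_word.split("-")
        accepted = (
            all(part.isdigit() for part in parts)
            and len(clean_word.replace("-", "")) in (10, 11)
        )
        return clean_word, accepted
    # strip(",.()") has already removed any leading "(", so this test
    # cannot succeed for a token such as "(123)".
    return clean_word, clean_word.startswith("(") and ")" in clean_word


trace = [classify(word) for word in line.split()]
for clean_word, accepted in trace:
    print(f"{clean_word!r}: {'accepted' if accepted else 'rejected'}")
```

The trace suggests that `text.split()` breaks `(123) 456-7890` into the tokens `(123)` and `456-7890`. The first becomes `123` after stripping and the second has only 7 digits, so both are rejected and only `987-654-3210` is collected.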
Current Output:
+---------------------+------------+-----------------+
| Email Address       | Date       | Phone Number    |
+---------------------+------------+-----------------+
| [email protected]   | 12-08-2024 | 987-654-3210    |
| [email protected]   | 2024/11/20 |                 |
+---------------------+------------+-----------------+
Expected Output:
+---------------------+------------+-----------------+
| Email Address       | Date       | Phone Number    |
+---------------------+------------+-----------------+
| [email protected]   | 12-08-2024 | (123) 456-7890  |
| [email protected]   | 2024/11/20 | 987-654-3210    |
+---------------------+------------+-----------------+
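One possible way to reach the expected table, still without regex, is to treat a parenthesized area code and the token that follows it as a single phone number, then pair the three lists by position. This is a minimal sketch under assumptions, not the original script: `extract_phone_numbers_grouped` and `save_rows_csv` are hypothetical names, the email addresses are placeholders because the real ones are obscured in the input, and the dates are hard-coded to stand in for the output of the existing date extractor.

```python
import csv
from itertools import zip_longest


def extract_phone_numbers_grouped(text):
    """Collect 10- or 11-digit numbers, joining '(123)' with the token after it."""
    phone_numbers = []
    words = text.split()
    i = 0
    while i < len(words):
        word = words[i].strip(",.")  # keep parentheses so "(123)" stays recognizable
        is_area_code = (
            word.startswith("(") and word.endswith(")") and word[1:-1].isdigit()
        )
        if is_area_code and i + 1 < len(words):
            rest = words[i + 1].strip(",.")
            rest_parts = rest.split("-")
            digit_count = len(word[1:-1]) + len("".join(rest_parts))
            if all(p.isdigit() for p in rest_parts) and digit_count in (10, 11):
                phone_numbers.append(f"{word} {rest}")
                i += 2  # both tokens are consumed
                continue
        parts = word.split("-")
        if all(p.isdigit() for p in parts) and len("".join(parts)) in (10, 11):
            phone_numbers.append(word)
        i += 1
    return phone_numbers


def save_rows_csv(path, rows):
    """Write the aligned rows to a CSV file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Email Address", "Date", "Phone Number"])
        writer.writerows(rows)


text = (
    "Our customer service is available 24/7. "
    "Call us at (123) 456-7890 or 987-654-3210.\n"
    "- Application Deadline: 12-08-2024\n"
    "- Event Date: 2024/11/20\n"
)
emails = ["first@example.com", "second@example.com"]  # placeholder addresses
dates = ["12-08-2024", "2024/11/20"]  # stand-in for the date extractor's output
phones = extract_phone_numbers_grouped(text)

# zip_longest pads shorter lists with "", so a missing value leaves an
# empty cell instead of dropping the whole row.
rows = list(zip_longest(emails, dates, phones, fillvalue=""))
```

Pairing by position only gives the expected rows when every list contains all of its values in order of appearance, which is why the missing first phone number shifted the column. Calling `save_rows_csv("output.csv", rows)` would then write the same rows to a file; the file name here is only an example.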