Getting Structured Data From Tricky PDFs With Google Gemini Flash

In this guide, we'll explore how to use vision-language models (VLMs), such as Gemini Flash or GPT-4o, to extract structured data from PDFs.

The results of the extraction for a Bill of Quantity PDF

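Gemini, Google's latest series of vision-language models, has shown state-of-the-art performance in text and image understanding. This improved multimodal capability, together with a long context window, makes it particularly useful for processing visually complex PDF data that traditional extraction models struggle with, such as figures, charts, tables, and diagrams.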

By doing so, you can easily build your own data extraction tool for visual file and web extraction. Here's how:

Gemini's long context window and multimodal capability make it particularly useful for processing visually complex PDF data that traditional extraction models struggle with.

Setting Up Your Environment

Before we dive into data extraction, let's set up the development environment. This guide assumes you have Python installed on your system. If not, download and install it from https://www.python.org/downloads/.

Install Required Libraries

Open your terminal or command prompt and run the following commands:

pip install git+https://github.com/emcf/thepipe
pip install pandas

For those new to Python, pip is the package installer for Python, and the commands above download and install the necessary libraries.
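To confirm that the install step worked before going further, a quick check like the following can help. This is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration, and the import names `thepipe` and `pandas` are taken from the import statements used later in this guide.

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# An empty list means both libraries are importable
print(missing_packages(["thepipe", "pandas"]))
```

If the printed list is not empty, re-run the corresponding pip command in the same Python environment you use to run your scripts.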

Set Up Your API Key

To use thepipe, you need an API key.

Disclaimer: While thepi.pe is a free, open-source tool, using the API has a cost of approximately $0.00002 per token. If you'd like to avoid these costs, check out the local setup instructions on GitHub. Note that you will still have to pay the LLM provider of your choice.

Here's how to get it and set it up:

1. Visit https://thepi.pe/platform/2
2. Create an account or log in
3. Find your API key on the settings page

Now, you need to set this as an environment variable. The process varies depending on your operating system:

Copy the API key from the settings menu on thepi.pe Platform.

For Windows:

1. Search for "Environment Variables" in the Start menu
2. Click "Edit the system environment variables"
3. Click the "Environment Variables" button
4. Under "User variables", click "New"
5. Set the variable name to THEPIPE_API_KEY and the value to your API key
6. Click "OK" to save

For macOS and Linux:

Open your terminal and add the following line to your shell configuration file (e.g., ~/.bashrc or ~/.zshrc):

export THEPIPE_API_KEY=your_api_key_here

Then, reload your configuration:

source ~/.bashrc # or ~/.zshrc
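A common stumbling block is a variable that was saved but is not visible to the Python process (for example, because the terminal was not restarted). The sketch below reads the variable with the standard library and fails with a clear message if it is missing. The helper name `get_api_key` is hypothetical, and whether thepipe reads the variable in exactly this way is an assumption.

```python
import os

def get_api_key(env=os.environ, name="THEPIPE_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Set it as described above and open a new terminal."
        )
    return key
```

Calling `get_api_key()` at the top of a script surfaces a configuration problem immediately instead of partway through an extraction run.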

Defining Your Extraction Schema

The key to successful extraction is defining a clear schema for the data you want to extract. Let's say we're extracting data from a Bill of Quantity document:

An example of a page from the Bill of Quantity document. The data on each page is independent of the other pages, so we do our extraction "per page". There are multiple pieces of data to extract per page, so we set multiple extractions to True.

Looking at the column names, we might want to extract a schema like this:

schema = {
    "item": "string",
    "unit": "string",
    "quantity": "int",
}

You can modify the schema to your liking on thepi.pe Platform. Clicking "View Schema" will give you a schema you can copy and paste for use with the Python API.
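Because a model's output does not always match the declared types, it can be useful to cast each extracted row against the schema yourself. The following is a minimal sketch, not part of thepipe: `coerce_row` and the `CASTS` mapping are hypothetical helpers, and only the two type names that appear in the schema above ("string" and "int") are handled.

```python
schema = {
    "item": "string",
    "unit": "string",
    "quantity": "int",
}

# Assumed mapping from schema type names to Python casts (illustrative only)
CASTS = {"string": str, "int": int}

def coerce_row(row, schema):
    """Cast each schema field of one row; missing or unparseable values become None."""
    clean = {}
    for field, type_name in schema.items():
        value = row.get(field)
        if value is None:
            clean[field] = None
            continue
        try:
            clean[field] = CASTS[type_name](value)
        except (TypeError, ValueError):
            clean[field] = None
    return clean
```

For example, a row whose quantity came back as the text "40" is returned with the integer 40, while a quantity of "n/a" becomes None so it can be reviewed later.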

Extracting Data from PDFs

Now, let's use extract_from_file to pull data from a PDF:


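from thepipe.extract import extract_from_file

# Send each page to the model separately and allow several rows per page
results = extract_from_file(
    "bill_of_quantity.pdf",
    schema=schema,
    ai_model="google/gemini-flash-1.5b",
    chunking_method="chunk_by_page",
    multiple_extractions=True,
)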

Here, we set chunking_method="chunk_by_page" because we want to send each page to the AI model individually (the PDF is too large to feed in all at once). We also set multiple_extractions=True because each PDF page contains multiple rows of data. Here's what a page of the PDF looks like:

The results of the extraction for the Bill of Quantity PDF as viewed on thepi.pe Platform
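To make the two settings concrete: chunking by page means the model sees one page at a time, and multiple extractions means each page can yield several rows. The sketch below does not call thepipe. It uses a hypothetical per-page structure (one list of row dictionaries per page) purely to illustrate how per-page rows combine into one flat list; the actual shape of thepipe's return value may differ.

```python
def flatten_pages(pages):
    """Combine per-page row lists into one flat list, tagging each row with its page."""
    rows = []
    for page_number, page_rows in enumerate(pages, start=1):
        for row in page_rows:
            rows.append({"page": page_number, **row})
    return rows

# Hypothetical data: two pages, with two rows and one row respectively
pages = [
    [
        {"item": "Cement", "unit": "bag", "quantity": 40},
        {"item": "Sand", "unit": "m3", "quantity": 12},
    ],
    [
        {"item": "Rebar", "unit": "kg", "quantity": 250},
    ],
]

for row in flatten_pages(pages):
    print(row)
```

Keeping the page number on each row makes it easier to trace a suspicious value back to the page it came from.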

Processing the Results

The extraction results are returned as a list of dictionaries. We can process these results to create a pandas DataFrame:


import pandas as pd
df = pd.DataFrame(results)
# Display the first few rows of the DataFrame
print(df.head())

This creates a DataFrame with all the extracted information, including textual content and descriptions of visual elements such as figures and tables.
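Extracted values may arrive as text even when the schema asks for an integer, so a type cleanup step is often worthwhile before exporting. The sketch below uses made-up rows standing in for real extraction results (they are not actual output) and converts the quantity column with `pd.to_numeric`, turning anything unparseable into a missing value rather than raising an error.

```python
import pandas as pd

# Hypothetical rows standing in for extraction results (not real output)
sample_results = [
    {"item": "Cement", "unit": "bag", "quantity": "40"},
    {"item": "Sand", "unit": "m3", "quantity": "n/a"},
    {"item": "Rebar", "unit": "kg", "quantity": 250},
]

df = pd.DataFrame(sample_results)

# Unparseable quantities become missing values; Int64 is pandas' nullable integer type
df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce").astype("Int64")

print(df.dtypes)
print(df[df["quantity"].isna()])  # rows that need a manual look
```

Using the nullable `Int64` dtype keeps the valid quantities as integers even when some rows are missing a value.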

Exporting to Different Formats

Now that we have our data in a DataFrame, we can easily export it to various formats. Here are some options:

Exporting to Excel

df.to_excel("extracted_research_data.xlsx", index=False, sheet_name="Research Data")

This creates an Excel file named "extracted_research_data.xlsx" with a sheet named "Research Data". The index=False parameter prevents the DataFrame index from being included as a separate column.

Exporting to CSV

If you prefer a simpler format, you can export to CSV:

df.to_csv("extracted_research_data.csv", index=False)

This creates a CSV file that can be opened in Excel or any text editor.
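The effect of index=False, used in both exports, is easiest to see by comparing the two outputs side by side. This small sketch uses a made-up one-row DataFrame and relies on `to_csv` returning the CSV text as a string when no file path is given, so nothing is written to disk.

```python
import pandas as pd

# Hypothetical one-row DataFrame for illustration
demo = pd.DataFrame([{"item": "Cement", "unit": "bag", "quantity": 40}])

with_index = demo.to_csv()                # includes the row index as a first column
without_index = demo.to_csv(index=False)  # only the data columns

print(with_index)
print(without_index)
```

With the index included, the header row starts with an empty column name and each data row starts with its row number, which is rarely wanted in a file handed to someone else.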

Ending Notes

The key to successful extraction lies in defining a clear schema and leveraging the multimodal capabilities of AI models.

As you become more comfortable with these techniques, you can explore more advanced features such as custom chunking methods, custom extraction prompts, and integrating the extraction process into larger data pipelines.

Happy extracting, and may your data always be structured and your insights plentiful!

2024-07-17 04:20:51 Translated from the original article by AI中文站