
<small id='8SXg5'></small><noframes id='8SXg5'>


      1. Dask: Metadata mismatch found in `from_delayed` when reading a JSON file


                <small id='5MThr'></small><noframes id='5MThr'>


                  Problem description

                  I'm just starting my adventure with Dask, and I'm learning on an example dataset in JSON format. I know that this is not the easiest data format in the world for a beginner :)

                  I have a dataset in JSON format. I loaded the data into a dataframe via dd.read_json and everything went well. The problem occurs when I call, for example, compute() or len().

                  I get this error:

                  ValueError: Metadata mismatch found in `from_delayed`.
                  
                  Partition type: `DataFrame`
                  +----------+-------+----------+
                  | Column   | Found | Expected |
                  +----------+-------+----------+
                  | column1  |   -   | object   |
                  | column2  |   -   | object   |
                  +----------+-------+----------+
                  

                  I tried different things, but nothing helped. I don't know how to handle this error.

                  Please help, I will be very grateful!

                  Recommended answer

                  My guess is that your JSON data has different columns in different parts of the data. When Dask DataFrame loads your JSON data, it looks at the first chunk of data to determine the column names and datatypes. It then assumes that all of your data looks like that chunk.

                  This assumption turns out to be wrong in your case: there is probably some column that only appears later on in the file. In the error table, a "-" under Found means the partition did not contain a column that the inferred metadata expected.
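To check whether this guess applies to your file, a small standard-library sketch like the one below can help. It assumes the file is line-delimited JSON (one object per line), which the question does not state, and the helper name `columns_missing_from_sample` is hypothetical. It only approximates how a byte-limited sample behaves; it is not Dask's own inference code.

```python
import json


def columns_missing_from_sample(path, sample_bytes=2**20):
    """Return columns that occur in the file but not in the first sample_bytes.

    Assumes line-delimited JSON where every non-empty line is an object.
    """
    sampled, later = set(), set()
    consumed = 0  # bytes read before the current line
    with open(path, "rb") as f:
        for raw in f:
            if raw.strip():
                keys = json.loads(raw).keys()
                # A record counts as sampled if it starts inside the window.
                (sampled if consumed < sample_bytes else later).update(keys)
            consumed += len(raw)
    return sorted(later - sampled)
```

A non-empty result lists the columns that first show up only beyond the sampled prefix of the file, which is the situation the answer describes.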

                  You might consider increasing the size of the sample that Dask reads when determining metadata such as column names.

                  import dask.dataframe as dd

                  df = dd.read_json(..., sample=2**26)  # sample 64 MiB (2**26 bytes) when inferring metadata
                  

                  The default is 1 MiB (2**20 bytes).
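If you are unsure how large the sample needs to be, one rough heuristic is to find the byte offset of the last record that introduces a new column and round up from there. The sketch below uses only the standard library and again assumes line-delimited JSON; `suggest_sample_bytes` is a hypothetical helper, not part of Dask, and it reads the whole file once, so it is only practical for files you can scan locally.

```python
import json


def suggest_sample_bytes(path, minimum=2**20):
    """Double `minimum` until it covers every record that adds a new column.

    Assumes line-delimited JSON where every non-empty line is an object.
    """
    seen = set()
    needed = 0  # offset just past the last record that added a column
    offset = 0
    with open(path, "rb") as f:
        for raw in f:
            offset += len(raw)
            if not raw.strip():
                continue
            keys = set(json.loads(raw))
            if not keys <= seen:
                seen |= keys
                needed = offset
    size = minimum
    while size < needed:
        size *= 2
    return size
```

The returned value can then be passed as the `sample` argument of `dd.read_json`. If the required size is close to the size of the whole file, a sample may not be a practical fix and the data itself may need a consistent set of columns.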


                      • <small id='mG26U'></small><noframes id='mG26U'>
