Hashmap slower after deserialization - Why?

Question

I have a pretty large HashMap (~250 MB). Creating it takes about 50-55 seconds, so I decided to serialize it and save it to a file. Reading it back from the file takes about 16-17 seconds.

The only problem is that lookups seem to be slower this way. I always thought the HashMap is read from the file into memory, so the performance should be the same as when I create the HashMap myself, right? Here is the code I am using to read the HashMap from the file:

File file = new File("omaha.ser");
FileInputStream f = new FileInputStream(file);
ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
omahaMap = (HashMap<Long, Integer>) s.readObject();  // deserialize the whole map in one shot
s.close();

300 million lookups take about 3.1 seconds when I create the HashMap myself, and about 8.5 seconds when I read the same HashMap from the file. Does anybody have an idea why? Am I overlooking something obvious?

I "measured" the time by just taking it with System.nanoTime(), so no proper benchmark method was used. Here is the code:

public class HandEvaluationTest
{
    public static void Test()
    {
        HandEvaluation.populate5Card();
        HandEvaluation.populate9CardOmaha();

        Card[] player1cards = {new Card("4s"), new Card("2s"), new Card("8h"), new Card("4d")};
        Card[] player2cards = {new Card("As"), new Card("9s"), new Card("6c"), new Card("2h")};
        Card[] player3cards = {new Card("9h"), new Card("7h"), new Card("Kc"), new Card("Kh")};
        Card[] table = {new Card("2d"), new Card("2c"), new Card("3c"), new Card("5c"), new Card("4h")};

        int j = 0, k = 0, l = 0;
        long startTime = System.nanoTime();
        // 100 million iterations x 3 evaluations = 300 million lookups
        for (int p = 0; p < 100000000; p++) {
            j = HandEvaluation.handEval9Hash(player1cards, table);
            k = HandEvaluation.handEval9Hash(player2cards, table);
            l = HandEvaluation.handEval9Hash(player3cards, table);
        }
        long estimatedTime = System.nanoTime() - startTime;
        System.out.println("Time needed: " + estimatedTime * Math.pow(10, -6) + "ms");  // ns -> ms
        System.out.println("Handstrength Player 1: " + j);
        System.out.println("Handstrength Player 2: " + k);
        System.out.println("Handstrength Player 3: " + l);
    }
}
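
Since the timing above is admittedly naive, here is a minimal sketch of a slightly more careful hand-rolled measurement: a warm-up pass so the JIT compiles the hot path before the clock starts, then several timed rounds whose run-to-run variance hints at GC or JIT interference. LookupBenchmark and run are hypothetical names, and the sketch assumes the question's HandEvaluation and Card classes; for trustworthy numbers a harness like JMH is the better tool.

public class LookupBenchmark
{
    // Warm up first, then time several rounds so JIT and GC effects become visible.
    public static void run(Card[][] players, Card[] table)
    {
        // Warm-up: let the JIT compile handEval9Hash before measuring.
        for (int p = 0; p < 1000000; p++) {
            for (Card[] hand : players) {
                HandEvaluation.handEval9Hash(hand, table);
            }
        }

        // Several independent rounds; large variance between rounds usually
        // means GC or JIT activity is polluting the numbers.
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            int sink = 0;  // keep the results live so the loop is not optimized away
            for (int p = 0; p < 10000000; p++) {
                for (Card[] hand : players) {
                    sink ^= HandEvaluation.handEval9Hash(hand, table);
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1000000;
            System.out.println("Round " + round + ": " + elapsedMs + " ms (sink=" + sink + ")");
        }
    }
}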

The big HashMap work is done in HandEvaluation.populate9CardOmaha(). The 5-card one is small. The code for the big one:

public static void populate9CardOmaha()
{
    // Check if the hashmap is already there - then just read it and exit
    File hashmap = new File("omaha.ser");
    if (hashmap.exists())
    {
        try
        {
            File file = new File("omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            omahaMap = (HashMap<Long, Integer>) s.readObject();
            s.close();
        }
        catch (IOException ioex) { ioex.printStackTrace(); }
        catch (ClassNotFoundException cnfex)
        {
            System.out.println("Class not found");
            cnfex.printStackTrace();
            return;
        }
        return;
    }

    // if it's not there, populate it yourself
    ... Code for populating hashmap ...
    // and then save it to file

    try
    {
        File file = new File("omaha.ser");
        FileOutputStream f = new FileOutputStream(file);
        ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
        s.writeObject(omahaMap);
        s.close();
    }
    catch (IOException ioex) { ioex.printStackTrace(); }
}
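
As a side note, on Java 7+ the same load-or-populate pattern can be written with try-with-resources, which closes the streams even when readObject() throws. A minimal sketch under that assumption, reusing the question's omahaMap field and file name (the method name is mine):

private static void loadOrPopulateOmahaMap()
{
    File file = new File("omaha.ser");
    if (file.exists())
    {
        // Streams are closed automatically, even if readObject() throws.
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file))))
        {
            omahaMap = (HashMap<Long, Integer>) in.readObject();
        }
        catch (IOException | ClassNotFoundException e)
        {
            e.printStackTrace();
        }
        return;
    }

    // ... populate omahaMap as before ...

    try (ObjectOutputStream out = new ObjectOutputStream(
            new BufferedOutputStream(new FileOutputStream(file))))
    {
        out.writeObject(omahaMap);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}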

When I am populating it myself (= the file is not there), lookups in HandEvaluationTest.Test() take about 8 seconds instead of 3. Maybe it's just my very naive way of measuring the elapsed time?

Answer

This question was interesting, so I wrote my own test case to verify it. I found no difference in speed for a live lookup vs. one that was loaded from a serialized file. The program is available at the end of the post for anyone interested in running it.

  • The methods were monitored using JProfiler.
  • The serialized file is comparable to yours: ~230 MB.
  • Lookups in memory, without any serialization, cost 1210 ms.
  • After serializing the map and reading it back in, the cost of the lookups remained the same (well, almost: 1224 ms).
  • The profiler was tweaked to add minimal overhead in both scenarios.
  • This was measured on Java(TM) SE Runtime Environment (build 1.6.0_25-b06), 4 CPUs running at 1.7 GHz, 4 GB RAM at 800 MHz.

Measuring is tricky. I myself noticed the 8 second lookup time that you described, but guess what else I noticed when that happened: garbage collection kicked in.

Your measurements are probably picking that up too. If you isolate the measurements of Map.get() alone, you'll see that the results are comparable.
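
One way to verify this yourself is to pre-generate the keys outside the timed region, request a collection before measuring, and time nothing but the Map.get() calls; running the JVM with -verbose:gc (or -Xlog:gc on Java 9+) then shows whether a collection lands inside the timed window. A minimal sketch; GetTimer and timeGets are illustrative names, not part of the answer:

import java.util.Map;
import java.util.Random;

public final class GetTimer
{
    // Times Map.get() alone: key generation happens before the clock starts.
    static long timeGets(Map<Long, Integer> map, int n)
    {
        final Random random = new Random(42);
        final long[] keys = new long[n];
        for (int i = 0; i < n; i++) {
            keys[i] = random.nextLong();
        }

        System.gc();  // a hint only, but it reduces the chance of a collection mid-measurement

        long start = System.nanoTime();
        int hits = 0;
        for (int i = 0; i < n; i++) {
            if (map.get(keys[i]) != null) {
                hits++;  // use the result so the lookups cannot be optimized away
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000;
        System.out.println(n + " gets in " + elapsedMs + " ms (" + hits + " hits)");
        return elapsedMs;
    }
}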

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class GenericTest
{
    public static void main(String... args)
    {
        // Call the methods as you please for a live Vs ser <-> de_ser run
    }

    private static Map<Long, Integer> generateHashMap()
    {
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        final Random random = new Random();
        for(int counter = 0 ; counter < 10000000 ; counter++)
        {
            final int value = random.nextInt();
            final long key = random.nextLong();
            map.put(key, value);
        }
        return map;
    }

    private static void lookupItems(int n, Map<Long, Integer> map)
    {
        final Random random = new Random();
        for(int counter = 0 ; counter < n ; counter++)
        {
            final long key = random.nextLong();
            final Integer value = map.get(key);
        }
    }

    private static void serialize(Map<Long, Integer> map)
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileOutputStream f = new FileOutputStream(file);
            ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
            s.writeObject(map);
            s.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    private static Map<Long, Integer> deserialize()
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            HashMap<Long, Integer> map = (HashMap<Long, Integer>) s.readObject();
            s.close();
            return map;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}
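
To actually run the comparison, main() might be wired up as below; the iteration count and the printed labels are my own choices rather than anything the answer specifies:

    public static void main(String... args)
    {
        // Live scenario: build the map and time lookups directly.
        Map<Long, Integer> liveMap = generateHashMap();
        long t1 = System.nanoTime();
        lookupItems(10000000, liveMap);
        System.out.println("Live lookups: " + (System.nanoTime() - t1) / 1000000 + " ms");

        // Round-trip scenario: serialize, read back, and time lookups again.
        serialize(liveMap);
        Map<Long, Integer> loadedMap = deserialize();
        long t2 = System.nanoTime();
        lookupItems(10000000, loadedMap);
        System.out.println("Deserialized lookups: " + (System.nanoTime() - t2) / 1000000 + " ms");
    }

Calling lookupItems() once before each timed call would additionally warm up the JIT for both scenarios, making the two numbers more directly comparable.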
