Bug 14016 - Font.GetHashCode() still not unique enough
Summary: Font.GetHashCode() still not unique enough
Status: NEW
Alias: None
Product: Class Libraries
Classification: Mono
Component: System.Drawing ()
Version: 2.10.x
Hardware: PC Linux
: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-08-15 17:09 UTC by Giancarlo Villanueva
Modified: 2013-08-28 01:31 UTC (History)
2 users (show)

Tags:
Is this bug a regression?: ---
Last known good build:

Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.

Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.

Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.


Please create a new report for Bug 14016 on GitHub or Developer Community if you have new information to add and do not yet see a matching new report.

If the latest results still closely match this report, you can use the original description:

  • Export the original title and description: GitHub Markdown or Developer Community HTML
  • Copy the title and description into the new report. Adjust them to be up-to-date if needed.
  • Add your new information.

In special cases on GitHub you might also want the comments: GitHub Markdown with public comments

Related Links:
Status:
NEW

Description Giancarlo Villanueva 2013-08-15 17:09:29 UTC
Font.GetHashCode() has a high number of hash collisions and these collisions occur in many common scenarios:

1.  Fonts created with different font units return the same hashcode

    [Test]
    //a font's units were not previously considered as part of the hash
    //function, generating the same hash for differently sized fonts
    //e.g., given one font with em-height of 1-inch and another with em-height
    //of 1-pixel, the same hashcode is returned
    public void GetHashCode_UnitDiffers_HashesNotEqual()
    {
        Font f1 = new Font("DejaVu Sans", 8.25F, GraphicsUnit.Point);
        Font f2 = new Font("DejaVu Sans", 8.25F, GraphicsUnit.Pixel);

        Assert.IsFalse(f1.GetHashCode() == f2.GetHashCode(),
            "Hashcodes should differ if _unit member differs");
    }

2.  Fonts with different names return the same hashcode

    [Test]
    //fonts whose Font name and FontFamily name are the same
    //generate the same hashcode and the XOR nullifies them
    //e.g., given two very different fonts, because _name.GetHashCode() and
    //FontFamily.GetHashCode() are the same, the XOR of their hashcodes always
    //returns 0, and the same font hashcode is generated
    public void GetHashCode_NameDiffers_HashesNotEqual()
    {
        Font f1 = new Font("DejaVu Sans", 8.25F, GraphicsUnit.Point);
        Font f2 = new Font("Liberation Sans", 8.25F, GraphicsUnit.Point);

        Assert.IsFalse(f1.GetHashCode() == f2.GetHashCode(),
            "Hashcodes should differ if _name member differs");
    }


3.  Fonts where (int)_style == (int)_gdiCharSet.  This probably also occurs when (int)_gdiCharSet == (int)gdiVerticalFont, but I didn't write a test case for it.

    [Test]
    //fonts that are the same except for the style and gdiCharSet
    //will return the same hashcode if the style == gdiCharSet
    //e.g., given two similar fonts where the first has
    //FontStyle.Regular and gdiCharSet = 0 and the second has
    //FontStyle.Bold and gdiCharSet = 1, the same hashcode is generated
    public void GetHashCode_StyleEqualsGdiCharSet_HashesNotEqual()
    {
        Font f1 = new Font("DejaVu Sans", 8.25F, FontStyle.Regular, GraphicsUnit.Point, ((byte)(0)));
        Font f2 = new Font("DejaVu Sans", 8.25F, FontStyle.Bold, GraphicsUnit.Point, ((byte)(1)));

        Assert.IsFalse(f1.GetHashCode() == f2.GetHashCode(),
            "Hashcodes should differ if _style member differs");
    }

To resolve these test cases, I rewrote the Font.GetHashCode() using some advice on a generic hash function strategy I found online.  My suggested implementation is the following:

    public override int GetHashCode ()
    {
        int hash = 17;
        unchecked
        {
            hash = hash * 23 + _name.GetHashCode();
            hash = hash * 23 + FontFamily.GetHashCode();
            hash = hash * 23 + _size.GetHashCode();
            hash = hash * 23 + _unit.GetHashCode();
            hash = hash * 23 + _style.GetHashCode();
            hash = hash * 23 + _gdiCharSet;
            hash = hash * 23 + _gdiVerticalFont.GetHashCode();
        }
        return hash;
    }

References:
[1] https://bugzilla.novell.com/show_bug.cgi?id=351647 - Previous work done to improve this method.
Comment 1 Giancarlo Villanueva 2013-08-15 17:14:39 UTC
Background:
My project ran into the issues with Font.GetHashCode() when we encountered erroneous TextBox rendering.  We sourced the cause of the issue to Font.GetHashCode() after reading the comments in TextBoxTextRenderer.MeasureText(Graphics, string, Font).  The performance optimizing code for the TextBox uses the font's hashcode + the character as a key to the precalculated character size.  It was due to the hash collisions within Font.GetHashCode() that we encountered rendering issues with the TextBox. The collisions caused the TextBox to measure using precalculated sizes from a variety of fonts.
Comment 2 Rodrigo Kumpera 2013-08-20 16:40:51 UTC
Hi Giancarlo,

Can you make a pull request out of this change? And since the new code does a lot of hashing, it makes sense to cache the hashcode in a field.

Thanks,
Rodrigo
Comment 3 Giancarlo Villanueva 2013-08-28 01:31:47 UTC
Rodrigo,

Just submitted the pull request with cached hashcode changes included.  It's my first attempt at a pull request; let me know if I messed up.  

Thanks,
Giancarlo