
Objective-C: Implementing Equality and Hashing

Implementing Equality and Hashing by Mike Ash

Welcome back to a late edition of Friday Q&A. WWDC pushed the schedule back one week, but it's finally time for another one. This week, I'm going to discuss the implementation of equality and hashing in Cocoa, a topic suggested by Steven Degutis.


Equality
Object equality is a fundamental concept that gets used all over the place. In Cocoa, it's implemented with the isEqual: method. Something as simple as [array indexOfObject:] will use it, so it's important that your objects support it.


It's so important that Cocoa actually gives us a default implementation of it on NSObject. The default implementation just compares pointers. In other words, an object is only equal to itself, and is never equal to another object. The implementation is functionally identical to:

    - (BOOL)isEqual: (id)other
    {
        return self == other;
    }


While oversimplified in many cases, this is actually good enough for a lot of objects. For example, an NSView is never considered equal to another NSView, only to itself. For NSView, and many other classes which behave that way, the default implementation is enough. That's good news, because it means that if your class has that same equality semantic, you don't have to do anything, and get the correct behavior for free.


Implementing Custom Equality
Sometimes you need a deeper implementation of equality. It's common for objects, typically what you might refer to as a "value object", to be distinct from another object but be logically equal to it. For example:

    // use mutable strings because that guarantees distinct objects
    NSMutableString *s1 = [NSMutableString stringWithString: @"Hello, world"];
    NSMutableString *s2 = [NSMutableString stringWithFormat: @"%@, %@", @"Hello", @"world"];
    BOOL equal = [s1 isEqual: s2]; // gives you YES!


Of course NSMutableString implements this for you in this case. But what if you have a custom object that you want to be able to do the same thing?

    MyClass *c1 = ...;
    MyClass *c2 = ...;
    BOOL equal = [c1 isEqual: c2];


In this case you need to implement your own version of isEqual:.


Testing for equality is fairly straightforward most of the time. Gather up the relevant properties of your class, and test them all for equality. If any of them are not equal, then return NO. Otherwise, return YES.


One subtle point with this is that the class of your object is an important property to test as well. It's perfectly valid to test a MyClass for equality with an NSString, but that comparison should never return YES (unless MyClass is a subclass of NSString, of course).


A somewhat less subtle point is to ensure that you only test properties that are actually important to equality. Things like caches that do not influence your object's externally-visible value should not be tested.


Let's say your class looks like this:

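    @interface MyClass : NSObject
    {
        int _length;
        char *_data;
        NSString *_name;
        NSMutableDictionary *_cache; // derived state, not part of the object's value
    }
    // accessors used by the equality check
    - (int)length;
    - (char *)data;
    - (NSString *)name;
    @end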


Your equality implementation would then look like this:

    - (BOOL)isEqual: (id)other
    {
        // note: no comparison of _cache
        return ([other isKindOfClass: [MyClass class]] &&
                [other length] == _length &&
                memcmp([other data], _data, _length) == 0 &&
                [[other name] isEqual: _name]);
    }



Hashing

Hash tables are a commonly used data structure, used to implement, among other things, NSDictionary and NSSet. They allow fast lookups of objects no matter how many objects you put in the container.


If you're familiar with how hash tables work, you may want to skip the next paragraph or two.


A hash table is basically a big array with special indexing. Objects are placed into an array with an index that corresponds to their hash. The hash is essentially a pseudorandom number generated from the object's properties. The idea is to make the index random enough to make it unlikely for two objects to have the same hash, but have it be fully reproducible. When an object is inserted, the hash is used to determine where it goes. When an object is looked up, its hash is used to determine where to look.


In more formal terms, the hash of an object is defined such that two objects have an identical hash if they are equal. Note that the reverse is not true, and can't be: two objects can have an identical hash and not be equal. You want to try to avoid this as much as possible, because when two unequal objects have the same hash (called a collision) then the hash table has to take special measures to handle this, which is slow. However, it's provably impossible to avoid it completely.
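
To make the lookup idea concrete, here is a toy sketch of how a hash could pick a slot in a fixed-size array of buckets. The helper name BucketIndexForObject and the modulo mapping are assumptions made for illustration; Cocoa's real collections may well do something different internally.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: maps an object's hash onto one of bucketCount slots.
// bucketCount must be nonzero. Equal objects return equal hashes, so they
// always map to the same slot; unequal objects may still share a slot,
// which is the collision case described above.
static NSUInteger BucketIndexForObject(id object, NSUInteger bucketCount)
{
    return [object hash] % bucketCount;
}
```

With this mapping, the two equal strings from the earlier example land in the same bucket, so a lookup only needs to call isEqual: on the handful of objects stored there.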


In Cocoa, hashing is implemented with the hash method, which has this signature:

    - (NSUInteger)hash;


As with equality, NSObject gives you a default implementation that just uses your object's identity. Roughly speaking, it does this:

    - (NSUInteger)hash
    {
        return (NSUInteger)self;
    }

The actual value may differ, but the essential point is that it's based on the actual pointer value of self. And just as with equality, if object identity equality is all you need, then the default implementation will do fine for you.


Implementing Custom Hashing
Because of the semantics of hash, if you override isEqual: then you must override hash. If you don't, then you risk having two objects which are equal but which don't have the same hash. If you use these objects in a dictionary, set, or something else which uses a hash table, then hilarity will ensue.


Because the definition of the object's hash follows equality so closely, the implementation of hash likewise closely follows the implementation of isEqual:.


An exception to this is that there's no need to include your object's class in the definition of hash. That's basically a safeguard in isEqual: to ensure the rest of the check makes sense when used with a different object. Your hash is likely to be very different from the hash of a different class simply by virtue of hashing different properties and using different math to combine them.



Generating Property Hashes
Testing properties for equality is usually straightforward, but hashing them isn't always. How you hash a property depends on what kind of object it is.


For a numeric property, the hash can simply be the numeric value.


For an object property, you can send the object the hash method, and use what it returns.


For data-like properties, you'll want to use some sort of hash algorithm to generate the hash. You can use CRC32, or even something totally overkill like MD5. Another approach, somewhat less speedy but easy to use, is to wrap the data in an NSData and ask it for its hash, essentially offloading the work onto Cocoa. In the above example, you could compute the hash of _data like so:

    [[NSData dataWithBytes: _data length: _length] hash]
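
If you would rather hash the bytes yourself than allocate an NSData, a small byte-wise hash is enough. The sketch below uses 32-bit FNV-1a, which is my substitution for illustration rather than the CRC32 or MD5 mentioned above; the function name HashBytesFNV1a is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: 32-bit FNV-1a over a raw byte buffer.
// Identical buffers always produce identical results, which is the
// property that hash needs.
static NSUInteger HashBytesFNV1a(const void *bytes, size_t length)
{
    const uint8_t *p = bytes;
    uint32_t hash = 2166136261u;   // FNV-1a 32-bit offset basis
    for (size_t i = 0; i < length; i++) {
        hash ^= p[i];              // mix in the next byte
        hash *= 16777619u;         // multiply by the FNV 32-bit prime
    }
    return (NSUInteger)hash;
}
```

In the example class this would be called as HashBytesFNV1a(_data, _length). It avoids the temporary object, at the cost of a little code you have to maintain yourself.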



Combining Property Hashes

So you know how to generate a hash for each property, but how do you put them together?


The easiest way is to simply add them together, or use the bitwise xor operator. However, this can hurt your hash's uniqueness, because these operations are symmetric, meaning that the separation between different properties gets lost. As an example, consider an object which contains a first and last name, with the following hash implementation:

    - (NSUInteger)hash
    {
        return [_firstName hash] ^ [_lastName hash];
    }


Now imagine you have two objects, one for "George Frederick" and one for "Frederick George". They will hash to the same value even though they're clearly not equal. And, although hash collisions can't be avoided completely, we should try to make them harder to obtain than this!


How to best combine hashes is a complicated subject without any single answer. However, any asymmetric way of combining the values is a good start. I like to use a bitwise rotation in addition to the xor to combine them:

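    #include <limits.h> // for CHAR_BIT

    #define NSUINT_BIT (CHAR_BIT * sizeof(NSUInteger))
    // howmuch must be greater than 0 and less than NSUINT_BIT
    #define NSUINTROTATE(val, howmuch) ((((NSUInteger)(val)) << (howmuch)) | (((NSUInteger)(val)) >> (NSUINT_BIT - (howmuch))))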
    
    - (NSUInteger)hash
    {
        return NSUINTROTATE([_firstName hash], NSUINT_BIT / 2) ^ [_lastName hash];
    }
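
To see the difference in isolation, here is a small sketch that combines two hash values both ways. The function names CombineXor and CombineRotateXor are hypothetical, and the rotation is written out as a function rather than a macro purely for readability.

```objc
#import <Foundation/Foundation.h>
#include <limits.h>

// Hypothetical helper: plain xor. Swapping a and b gives the same result.
static NSUInteger CombineXor(NSUInteger a, NSUInteger b)
{
    return a ^ b;
}

// Hypothetical helper: rotate a by half the word width, then xor with b.
// Swapping a and b now generally gives a different result.
static NSUInteger CombineRotateXor(NSUInteger a, NSUInteger b)
{
    const NSUInteger width = CHAR_BIT * sizeof(NSUInteger);
    const NSUInteger shift = width / 2;
    NSUInteger rotated = (a << shift) | (a >> (width - shift));
    return rotated ^ b;
}
```

Feeding [@"George" hash] and [@"Frederick" hash] into CombineXor in either order yields the same number, which is exactly the collision described above. CombineRotateXor makes the order matter, so the two names only collide if the hashes happen to line up by accident.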



Custom Hash Example

Now we can take all of the above and use it to produce a hash method for the example class. It follows the basic form of the equality method, and uses the above techniques to obtain and combine the hashes of the individual properties:

    - (NSUInteger)hash
    {
        NSUInteger dataHash = [[NSData dataWithBytes: _data length: _length] hash];
        return NSUINTROTATE(dataHash, NSUINT_BIT / 2) ^ [_name hash];
    }


If you have more properties, you can add more rotation and more xor operators, and it'll work out just the same. You'll want to adjust the amount of rotation for each property to make each one different.
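
As a rough sketch of that extension, the helper below combines three property hashes, giving the first two a different rotation each. The names RotateLeft and CombineThreeHashes, and the choice of one third and two thirds of the word width, are assumptions for illustration only.

```objc
#import <Foundation/Foundation.h>
#include <limits.h>

// Hypothetical helper: rotates value left by bits, where 0 < bits < word width.
static NSUInteger RotateLeft(NSUInteger value, NSUInteger bits)
{
    const NSUInteger width = CHAR_BIT * sizeof(NSUInteger);
    return (value << bits) | (value >> (width - bits));
}

// Hypothetical combiner: each property hash gets its own rotation amount,
// so exchanging two property values generally changes the result.
static NSUInteger CombineThreeHashes(NSUInteger h1, NSUInteger h2, NSUInteger h3)
{
    const NSUInteger width = CHAR_BIT * sizeof(NSUInteger);
    return RotateLeft(h1, width / 3) ^ RotateLeft(h2, (2 * width) / 3) ^ h3;
}
```

A hash method would pass in the individual property hashes, for example CombineThreeHashes(dataHash, [_name hash], someOtherHash), where someOtherHash stands for whatever third property your class adds.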


A Note on Subclassing
You have to be careful when subclassing a class which implements custom equality and hashing. In particular, your subclass should not expose any new properties which equality is dependent upon. If it does, then it must not compare equal with any instances of the superclass.


To see why, consider a subclass of the first/last name class which includes a birthday, and includes that as part of its equality computation. It can't include it when comparing equality with an instance of the superclass, though, so its equality method would look like this:

    - (BOOL)isEqual: (id)other
    {
        // if the superclass doesn't like it then we're not equal
        if(![super isEqual: other])
            return NO;
        
        // if it's not an instance of the subclass, then trust the superclass
        // it's equal there, so we consider it equal here
        if(![other isKindOfClass: [MySubClass class]])
            return YES;
        
        // it's an instance of the subclass, the superclass properties are equal
        // so check the added subclass property
        return [[other birthday] isEqual: _birthday];
    }


Now you have an instance of the superclass for "John Smith", which I'll call A, and an instance of the subclass for "John Smith" with a birthday of 5/31/1982, which I'll call B. Because of the definition of equality above, A equals B, and B also equals itself, which is expected.


Now consider an instance of the subclass for "John Smith" with a birthday of 6/7/1994, which I'll call C. C is not equal to B, which is what we expect. C is equal to A, also expected. But now there's a problem. A equals both B and C, but B and C do not equal each other! This breaks the standard transitivity of the equality operator, and leads to extremely unexpected results.


In general this should not be a big problem. If your subclass adds properties which influence object equality, that's probably an indication of a design problem in your hierarchy anyway. Rather than working around it with weird implementations of isEqual:, consider redesigning your class hierarchy.
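
If you do end up with such a subclass anyway, one way to keep equality transitive is to make instances of different classes never compare equal. The sketch below does that with an exact class check in the superclass. The class names Person and PersonWithBirthday are hypothetical, the code assumes ARC and non-nil names and birthdays, and it is only one possible approach.

```objc
#import <Foundation/Foundation.h>
#include <limits.h>

// Hypothetical superclass holding a first and last name.
@interface Person : NSObject
@property (nonatomic, copy, readonly) NSString *firstName;
@property (nonatomic, copy, readonly) NSString *lastName;
- (instancetype)initWithFirstName:(NSString *)first lastName:(NSString *)last;
@end

@implementation Person
- (instancetype)initWithFirstName:(NSString *)first lastName:(NSString *)last
{
    if ((self = [super init])) {
        _firstName = [first copy];
        _lastName = [last copy];
    }
    return self;
}

- (BOOL)isEqual:(id)other
{
    // Exact class match: a Person never equals a PersonWithBirthday,
    // in either direction.
    if ([other class] != [self class])
        return NO;
    Person *p = other;
    return [p.firstName isEqual:_firstName] && [p.lastName isEqual:_lastName];
}

- (NSUInteger)hash
{
    // Rotate the first name's hash by half the word width, then xor.
    const NSUInteger half = (CHAR_BIT * sizeof(NSUInteger)) / 2;
    NSUInteger h = [_firstName hash];
    return ((h << half) | (h >> half)) ^ [_lastName hash];
}
@end

// Hypothetical subclass that adds a property which affects equality.
@interface PersonWithBirthday : Person
@property (nonatomic, copy, readonly) NSDate *birthday;
- (instancetype)initWithFirstName:(NSString *)first
                         lastName:(NSString *)last
                         birthday:(NSDate *)birthday;
@end

@implementation PersonWithBirthday
- (instancetype)initWithFirstName:(NSString *)first
                         lastName:(NSString *)last
                         birthday:(NSDate *)birthday
{
    if ((self = [super initWithFirstName:first lastName:last])) {
        _birthday = [birthday copy];
    }
    return self;
}

- (BOOL)isEqual:(id)other
{
    // super has already rejected anything that is not exactly this class
    if (![super isEqual:other])
        return NO;
    return [[(PersonWithBirthday *)other birthday] isEqual:_birthday];
}

- (NSUInteger)hash
{
    // Equal objects have equal names and birthdays, so equal hashes.
    return [super hash] ^ [_birthday hash];
}
@end
```

With this version, A no longer equals B or C, so the contradiction above cannot arise. The trade-off is that a subclass instance can never stand in for an equal superclass instance, which may or may not suit your design.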


A Note on Dictionaries
If you want to use your object as a key in an NSDictionary, you need to implement hashing and equality, but you also need to implement -copyWithZone:. Techniques for doing that are beyond the scope of today's post, but you should be aware that you need to go a little bit further in that case.
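
For orientation only, here is a minimal sketch of a class that satisfies all three requirements. The class name NameKey is hypothetical, and the sketch assumes the object is immutable once created, which keeps the copy trivial. Mutable objects need more care.

```objc
#import <Foundation/Foundation.h>

// Hypothetical immutable key class wrapping a single string.
@interface NameKey : NSObject <NSCopying>
@property (nonatomic, copy, readonly) NSString *name;
- (instancetype)initWithName:(NSString *)name;
@end

@implementation NameKey
- (instancetype)initWithName:(NSString *)name
{
    if ((self = [super init])) {
        _name = [name copy];
    }
    return self;
}

// NSDictionary copies its keys, so the key class must support copying.
- (id)copyWithZone:(NSZone *)zone
{
    return [[[self class] allocWithZone:zone] initWithName:_name];
}

- (BOOL)isEqual:(id)other
{
    return [other isKindOfClass:[NameKey class]] &&
           [[(NameKey *)other name] isEqual:_name];
}

// Must agree with isEqual: so equal keys hash equally.
- (NSUInteger)hash
{
    return [_name hash];
}
@end
```

A value stored under one NameKey can then be fetched with a different NameKey instance that has the same name, because the dictionary finds its copied key through hash and confirms the match with isEqual:.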


Conclusion
Cocoa provides default implementations of equality and hashing which work for many objects, but if you want your objects to be considered equal even when they're distinct objects in memory, you have to do a bit of extra work. Fortunately, it's not difficult to do, and once you implement them, your class will work seamlessly with many Cocoa collection classes.

