ProtocolDiscussion
SIQ Protocol Internet Draft submitted to IETF:
http://www.ietf.org/internet-drafts/draft-irtf-asrg-iar-howe-siq-02.txt
http://www.ietf.org/internet-drafts/draft-irtf-asrg-iar-howe-siq-01.txt
http://www.ietf.org/internet-drafts/draft-irtf-asrg-iar-howe-siq-00.txt
http://www.networksorcery.com/enp/Protocol.htm#S
Around Feb 18th 2004 we decided to split the protocol used for SIQueries into two protocols, to avoid the conflicting goals that come from forcing everything into one protocol: high speed, anti-spoofing, and the server being able to trust clients to send accurate data.
SNMPForDummies
SNMP traps are proposed as the protocol for trusted-client data sent from client to server.
http://sourceforge.net/projects/snmpy/ A Python implementation of SNMP
http://www.wtcs.org/snmp4tpc/snmp_rfc.htm SNMP resources
Support for multiple scores for a domain IP pair --Robert Barclay, Fri, 22 Oct 2004 12:42:58 -0400
Queries that specify criteria to use or type of score to return --AprilDL, Sun, 24 Oct 2004 08:45:09 -0400
I was strongly in favor of including a mechanism for the query client to control the processing that occurs in the query server. In fact, the SIQ protocol does have this, although it wasn't named or described that way. Requests from you and others, along with specific examples of the need and the result, could prompt us to explain this usage in the protocol specification and/or make some other adaptation.
Please refer to section 3.1, UDP Query Format, under the description of "RD": "or other characters may be sent here to specify other than default processing". The amount of space reserved for this allows a huge number of different choices for the type of processing or type of score you want returned.
I believe Anthony spoke against this because his view and Derek's, as I understand them, is that a large number of independent SIQ servers exist in the world, that the query client might choose to query many different ones, and that they might not all offer or know the same RD codes for "how to process". Perhaps that fits a random, anonymous sort of query/response system, more like the blacklists of today. My view, in contrast, is that the query client and query server KNOW EACH OTHER and have clear policies and agreements about what a request where RD="x" or y or z means in terms of this particular query client. In my view, the query client selects a particular SIQ server network which will give him the exact type of processing of queries which he wants.
Please give examples of the number of different kinds of scores you would want returned, or of the different "things you want to know", and tell me whether my answer has demonstrated that what you want to do is already supported by the SIQ protocol. Thank you,
SIQ client independence -- Sun, 24 Oct 2004 09:35:29 -0400
Essentially the SIQ protocol is designed to be ignorant of the SIQ server implementation. You don't want to put server-specific implementation details into a protocol unless you can clearly make a case that they are universally required. This allows any SIQ client to query any SIQ server in a universal and consistent way. Think how DNS, SMTP, and POP are so universal in nature, yet have the possibility of extension beyond the basic minimum.
The UDP protocol does provide for three standard types of scores. Originally I opted for one single score, but April convinced me of the utility of two additional types of scores (see the draft). A UDP packet has to be well defined, and allocating extra fields for server-specific details is not a good idea for maximum portability, unless it can be shown that all SIQ servers would implement the functionality demanded by the new fields. The UDP format does allow for the RD section to be used for "other" data, though maybe this should be stated more clearly in the Rationale which discusses it.
An HTTP query and its response are more flexible for custom server-specific details. This is because extra HTTP headers can easily be added in both the query and the response without impacting the standard elements returned. A basic SIQ server would ignore unknown or unsupported headers in a query from a specialised SIQ client and return a basic response, while an enhanced SIQ server could just add supplemental X-SIQ-* headers to the response without impacting a basic SIQ client.
The nice thing about the current draft protocol is that different SIQ client implementations can work with any SIQ server implementation. I can install milter-siq and use Outbound-Index one day, then switch to another SIQ server if I'm not satisfied, or even use many SIQ servers at once and compare the results without having to change my installed software.
Anthony Howe
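To illustrate the point about HTTP headers, here is a minimal Python sketch of how a client might separate the headers it understands from supplemental X-SIQ-* headers. It is only an illustration under assumptions: the header name `Score`, the example `X-SIQ-Volume` header, and the function `split_siq_headers` are hypothetical and are not taken from the draft. Only the X-SIQ-* prefix comes from the post above.

```python
def split_siq_headers(raw_headers, known=("score",)):
    """Split an HTTP-style header block into standard and extension headers.

    `known` lists the (lower-cased) header names a basic client understands.
    The name "score" is a hypothetical placeholder, not taken from the draft.
    """
    standard = {}
    extensions = {}
    for line in raw_headers.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, _, value = line.partition(":")
        key = name.strip().lower()
        value = value.strip()
        if key.startswith("x-siq-"):
            # Supplemental data from an enhanced server.
            extensions[key] = value
        elif key in known:
            standard[key] = value
        # Anything else is silently ignored, as a basic client would do.
    return standard, extensions


# Hypothetical response from an enhanced server.
raw = "Score: 72\r\nX-SIQ-Volume: high\r\nServer: example\r\n"
standard, extensions = split_siq_headers(raw)
```

A basic client would read only `standard` and discard the rest, while a specialised client could also inspect `extensions`. Neither needs to change when a server adds new X-SIQ-* headers.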
Re: SIQ client independence --Robert Barclay, Tue, 26 Oct 2004 17:46:25 -0400
First, the document says in several places that the method of deriving the scores, and their meanings, are outside the scope of the protocol. This seems on its face to contradict your desire to be able to interchangeably query SIQ servers and compare the results. Unless the scores all mean the same thing, there is no way to guarantee that the comparison is meaningful unless you already know something about the provider of the score and their methods.
It sounds from the above post that the goal is for a single entity to be able to publish a single score rating how likely they think it is that a receiver should accept email from a source. I think this is an unnecessary limitation and that this protocol could be used to provide much richer reputation data.
The protocols you gave, I think, provide some interesting examples of dealing with exactly the problem of multiple types of data being exchanged over a single protocol. Both SMTP and DNS have mechanisms for a client to specify exactly what type of data it is sending to, or expects to receive from, the server. Both protocols also have extension mechanisms built into them for clients and servers to exchange data outside of the published spec (through ESMTP commands, or through querying unassigned type codes in DNS).
In short, I think the ability to use the protocol to exchange elements of server reputation more flexibly is one of the biggest potential benefits of SIQ over further expansion of existing DNSBL formats, and is by itself valuable enough to justify a specific extension mechanism within the UDP query. This is especially true if it can be done in a way that does not break the base functionality.
Flexibility --AprilDL, Wed, 27 Oct 2004 08:56:43 -0400
Looking forward to the examples. My greatest hope is that, although there may be many creative and different ways of using it, the protocol will eventually, if not now, be both flexible and bounded enough to serve them all well.
Examples of client data impacting server response --Robert Barclay, Wed, 27 Oct 2004 13:23:35 -0400
Example #1 - A large receiver already has a fairly substantial set of data on which to base a reputation, but it is all local. Their primary use for a reputation system is to obtain additional data elements to add into their system, if the service has some unique or interesting new data. They would like to be able to choose specific data elements from the ones a reputation service offers and add those into their system. Within their query to the system they would specify the elements they are interested in. They could either query an individual data element and receive the score for that element along with some other data in the response, or they could query a set of elements and receive a score aggregated across those specific elements, along with the individual scores in the text part of the response.
Example #2 - A data aggregator allows email receivers to create their own custom score, deciding which elements to weight and how heavily. The score calculation is done by the aggregator and then queried by the receiver in their mail stream. The server needs a mechanism to know which custom score it should return, or whether it should return some default calculated score. The default score in this case could be identical to existing SIQ implementations. In the case of a custom score the query format would vary somewhat, but the response format would look identical (just with a different score).
Do these two help? If not, I can probably come up with some others (though they may not be based on specific discussions I have had with potential consumers of reputation data).
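Both examples come down to combining a chosen subset of data elements into one score, optionally with per-receiver weights. A minimal Python sketch of that idea follows. It assumes a weighted mean, which is only one possible way to aggregate and is not something the SIQ draft prescribes. The element names and the function `aggregate_score` are hypothetical.

```python
def aggregate_score(element_scores, weights=None):
    """Combine selected element scores into one score using a weighted mean.

    element_scores: mapping of element name -> score for the chosen elements.
    weights: optional mapping of element name -> weight; equal weights are
             used when omitted (the "default score" case).
    """
    if not element_scores:
        raise ValueError("at least one element score is required")
    if weights is None:
        weights = {name: 1.0 for name in element_scores}
    total_weight = sum(weights[name] for name in element_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(score * weights[name]
                       for name, score in element_scores.items())
    return weighted_sum / total_weight


# Hypothetical element names, for illustration only.
selected = {"volume": 80, "complaints": 40}
default_score = aggregate_score(selected)                      # equal weights
custom_score = aggregate_score(selected, {"volume": 3, "complaints": 1})
```

In Example #1 the receiver picks the keys of `selected`. In Example #2 the aggregator stores the `weights` on behalf of each receiver and returns the resulting score in an otherwise unchanged response.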
Re: SIQ client independence --Anthony Howe, Mon, 01 Nov 2004 04:21:58 -0500
By NOT specifying how the scores are generated here, several options remain open: a separate specification can be put forward by IAR to cover scoring (either loosely or strictly); SIQ server implementers have the option to patent and/or charge for their service, so market forces should govern who has the better method. I think a specification on scoring would NOT be a good idea; it might limit a SIQ server implementer's options as to what they can do and how they might innovate, and, more likely, we could never reach consensus on how scoring should be done. If scoring were like a sports match, it would be easy enough to create rules for how to score, but I think there are so many possible variables to consider in a reputation system that the best one can do is specify how the I/O should be presented (the SIQ protocol) and allow for extensions to fine-tune the process for sites.
§ Concerning protocol tailored request/responses: Your two examples should be possible within the framework of the current protocol. Personally, I'm not sure I like the RCPT portion of the query being multi-purpose; I would prefer to define that field, even if it is empty, and say that what follows in the remainder of the packet is server-specific query data. The HTTP version is certainly flexible enough to accommodate your desired request format and would be the recommended method. The Outbound Index tends to fall under the second example. I personally don't want to fuss with tuning a server's scoring system, having enough other things to tweak, so I'm not attracted to such features. IMHO the more accurate yet automated the process, the more comfortable I am; if I don't like the results, I can change servers. This has been argued over many times by April and me.
Now something else that is being overlooked here, in particular with the UDP format, is the VERSION field.
For maximum customization, we could modify the protocol to state that if bit 0 from the start of the packet (the high-order bit of the VERSION) is set, then packet bits 1-7 (or 1-32) are a SIQ server specific packet version code and the rest of the UDP packet is SIQ server specific. We could call on IANA to record/register these variants, if we used 1-32. If we used the shorter version code packet bits 1-7, giving VERSION code range 128-255, then we simply state that a SIQ client/server have an intimate relationship in which they agree; a basic SIQ server receiving a custom request could simply reply with VERSION=1 SCORE=UNKNOWN.
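The shorter variant of this proposal (packet bits 1-7, VERSION range 128-255) can be sketched in Python as follows. This is only an illustration of the proposal as worded above, not part of the draft. It assumes VERSION is the first octet of the UDP packet. The function names and the dictionary standing in for the reply are hypothetical, since the wire encoding of SCORE=UNKNOWN is not given here.

```python
def classify_version(packet):
    """Classify a packet by the high-order bit of its first octet.

    Returns ("custom", code) when the high-order bit is set, where code is
    the 7-bit server-specific version (VERSION values 128-255), otherwise
    ("standard", version).
    """
    if not packet:
        raise ValueError("empty packet")
    version = packet[0]
    if version & 0x80:                  # bit 0 from the start of the packet
        return ("custom", version & 0x7F)
    return ("standard", version)


def basic_server_reply(packet):
    """What a basic server might do with a request it does not understand.

    Returns a placeholder reply for custom packets and None for standard
    ones, which would go through normal processing (not shown).
    """
    kind, _ = classify_version(packet)
    if kind == "custom":
        # Placeholder for the VERSION=1 SCORE=UNKNOWN reply.
        return {"VERSION": 1, "SCORE": "UNKNOWN"}
    return None


kind, code = classify_version(bytes([0x85]))   # high bit set, code 5
```

A client and server that have agreed on a custom code between themselves would branch on `code`; everyone else falls back to the basic reply.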
Further on the above examples --Robert Barclay, Tue, 02 Nov 2004 12:32:47 -0500
Re: SIQ client independence --Robert Barclay, Tue, 02 Nov 2004 12:58:40 -0500
Concerning protocol tailored request/responses: I concur that both of these examples can be done within the existing framework. I am just not sure that it can be done in a completely sound way. I think it would make more sense (as I think you suggested) to have a separate portion of the query packet reserved for implementation-specific data. Using the RCPT data portion may inadvertently break implementations that put the actual RCPT domain in this area, and requiring all of the requests to be done over HTTP is likely to be a pretty significant hurdle to convincing people to query your server. (On the second point I will admit that I am just guessing and would need to defer to people with expertise in getting SIQ querying implemented by various systems.)
RCPT domain in query --AprilDL, Thu, 04 Nov 2004 11:52:27 -0500
Sendmail libmilter presently isn't capable of handling a multi-recipient message as if there were a copy for each domain that could be handled differently (different header writing, accept, reject, etc.). Doing so in an MS Exchange plug-in is also too time-consuming for our first pass at developing one. So at present we accept that if you are sharing an MX, you are sharing the same SIQ processing, due to multi-RCPT messages.
Re: RCPT domain in query --Anthony Howe, Sat, 20 Nov 2004 03:51:42 -0500
§ Concerning multiple RCPTs for a message: It's an awful problem to solve. If you make multiple queries for a single message, one per RCPT domain, then how do you deal with cases where some RCPT domains have tuned their scoring to say reject, others discard, still others tag, and finally some accept? In the case of reject/discard, you can simply drop a RCPT from the RCPT list, but in the case of tag and accept conflicts, there is no way to generate new messages per RCPT tailored to reflect their desires, especially in a pre-DATA milter. So you could implement some sort of majority rule, with ties being resolved by a sysadmin-configured choice.
§ Concerning the meaning and interpretation of scores: I think there should be a universal, strict definition for the SCORE, an overall composite score that makes a judgement (how is not specified). The other supporting scores (IP-SCORE, DOMAIN-SCORE, REL-SCORE) used in the generation of the SCORE can and should have weaker definitions, since they will reflect specific data collected by a SIQ server and possibly adjusted by user preferences. I think it's necessary to provide one SCORE field that goes out on a limb and makes a judgement, as this simplifies client-side implementation for basic service. Refined client implementations could take the other scores and generate their own composite judgement, and either ignore the SCORE or compare against it. Having one well-defined SCORE that gives a judgement (-1..100) is already far better than the binary answer provided by blacklists: it allows for shades of grey that the client MX can work with and tune.
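The majority rule suggested for multi-RCPT messages could look something like the Python sketch below. It is only a sketch under assumptions: the action names follow the post (reject, discard, tag, accept), while the function `majority_action` and its `tie_breaker` parameter are hypothetical stand-ins for whatever a milter would actually expose as the sysadmin-configured choice.

```python
from collections import Counter


def majority_action(actions, tie_breaker="accept"):
    """Pick one action for a message from the per-RCPT-domain verdicts.

    actions: list of verdicts, one per RCPT domain queried.
    tie_breaker: the sysadmin-configured action used when no single
                 verdict has the most votes.
    """
    if not actions:
        raise ValueError("no verdicts to choose from")
    counts = Counter(actions)
    top = max(counts.values())
    leaders = [action for action, n in counts.items() if n == top]
    if len(leaders) == 1:
        return leaders[0]
    return tie_breaker  # tie: fall back to the configured choice


verdict = majority_action(["reject", "tag", "tag", "accept"])
tied = majority_action(["reject", "accept"], tie_breaker="tag")
```

Here the tie-breaker is applied even when the configured action is not among the tied leaders, which is the simplest reading of "resolved by a sysadmin-configured choice"; a stricter variant could restrict the choice to the tied actions.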
SIQ Response Scores --Anthony Howe, Sat, 20 Nov 2004 04:41:58 -0500
Domain / IP: public or private? --Anthony Howe, Sat, 20 Nov 2004 05:06:53 -0500
http://www.dss.state.ct.us/digital/eupriv.html Apparently the original text is full of exceptions and may have had addenda since, but I haven't found any clear text as to how it impacts anti-spam filters. This document, http://www.loeb.com/CM/Articles/articles19.asp, is also interesting, as it provides a summary of the Safe Harbor mechanism concerning US/EU relations.
I'm going to put forward this argument concerning a domain and IP:
a. An IP address is assigned temporarily (dynamic DNS) or long term (static), but it is something that has to be requested and allocated. Long-term IP allocations are in the public record (but the ipwhois information has also been recently limited by RFC 3912).
b. A data subject or business rents a domain name in order to simplify how they are found on the Internet; this rental information (whois) is in the public record (though some elements of this have recently changed with RFC 3912).
Now an IP address is required if you want to be on the Internet, otherwise it's impossible to function, but in theory a domain name is optional, voluntary, and a clear indication of one's intent to be found for some purpose. Having an IP address is similar to having a street address; a domain name is similar to hanging a sign outside your shop with your business name on it, or giving a name to your villa or farm. You need the IP, but you can live without a domain name (a return to the 1970s Internet).
I would argue that an IP address is not as revealing about a data subject, especially now that RFC 3912 allows ipwhois servers to limit what they reveal to the public with respect to their national privacy laws, such as real-world addresses and phone numbers. So querying an IP-based blacklist reveals little or nothing about a data subject other than maybe their name.
If an IP can be considered public information, and having a domain is considered to indicate some willingness to be more easily found on the Internet and is therefore public information, then I see using a domain name and IP address in a query as not being able to communicate anything of value concerning a data subject that hasn't already been revealed by the data subject themselves. In the case of email, as mentioned in the Security Considerations of the SIQ protocol draft: similar information already appears in message trace headers, and those headers may have already been viewed and logged by intermediate MX servers during transit. Taking this perspective, the queries make use of information that may have already been revealed elsewhere.
Domain spoofing and privacy --AprilDL, Sat, 20 Nov 2004 18:44:43 -0500
Re: Domain spoofing and privacy --Anthony Howe, Sun, 21 Nov 2004 10:47:59 -0500
SIQ response scores --Robert Barclay, Thu, 02 Dec 2004 11:52:12 -0500
I also suggested that there should be a defined set of known data elements that can be published and a mechanism to extend that set (ideally both a private mechanism for interaction between systems that know each other and a public mechanism to extend the known set). None of this is meant to say that any specific reputation system will use more than a small set of the known data elements, of course.
Domain spoofing and Privacy --Robert Barclay, Thu, 02 Dec 2004 12:00:58 -0500
Re: SIQ response scores --Anthony Howe, Thu, 09 Dec 2004 06:52:30 -0500
This page was last edited 2 years ago by AprilDL.