Rules as Code Architecture Diagram, Random thoughts

Used this slide in a presentation today, and thought I would ask here if anyone had any thoughts, good, bad, or indifferent, on it:

The basic concept is that in Rules as Code you want to isolate the legal reasoning from the application, so that the legal reasoning tool can be re-used in many different applications, and for many different purposes. (The specific purposes it can be put to are dependent on its features, but regardless of which features it has, this same architecture would apply, I think.)

It’s not possible to completely isolate it, though. The encoding is going to have inputs that it is able to deal with, and outputs that it is able to generate. Those may change over time, if the rules change. (They won’t always change when the rules change, but they often will.)

Similarly, the application may change in terms of the information it’s collecting, or what it wants to do with the results. So there may be a need to modify the middle layer if the data structure of the laws changes or the data structure of the application changes.

But that translation can be isolated, so that you know where it is, and what needs to be fixed when things change.

I anticipate that the task of building the middle layer will fall, most commonly, on the application developer, because it doesn’t require any substantive knowledge of the rules themselves, just substantive knowledge of the intended meaning of the variables used in the rules’ data structure. But I’m not super confident of that. If it does fall to them, there is an obvious risk that it isn’t isolated from the rest of the application. Which is OK, if that ends up being how they want to proceed. But it makes it more difficult to know what parts of the application will need updating when the data structure being used by the law is adjusted.

I’m also currently sort of struggling with the idea that doing it this way, with two additional layers of architecture and far more code for the Rules than a single application would otherwise need, means that it will never get done for the purposes of a single product.

Either the encoding has to be the product, or there need to be so many products using the same rules, rules that change frequently enough or carry some other risk, that a single organization saves money by refactoring them all into this architecture. The likelihood of that seems really low. So it feels like maybe it’s not realistic to expect people to volunteer for that, and it falls to the people who are writing the laws to do it that way because it benefits others. Which is not a great adoption strategy.

I know a banking industry organization in the UK tried some rules as code work on behalf of their members, but I don’t know how far it got.

Anyway, random thoughts for a Friday.

Hey @jason , I find myself here after a series of jumps through chat channels and links.

I think you’re wrong in thinking the middle layer will fall on the application programmer.

As you pointed out, the middle layer will actually be quite coupled with the output of the reasoning layer. Because there are many Prolog implementations, or reasoning engines, the middle layer must be able to understand them.

As for inputs, the middle layer can use a standard for input, such as JSON. This way many different applications can interface with the reasoning layer.

You can encode queries via JSON to satisfy many use cases. This is a common pattern with GraphQL.
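As a sketch of what such JSON-encoded queries might look like (all field names and values here are invented for illustration, not drawn from any real system):

```python
# Hypothetical sketch: one JSON query shape serving two different use cases
# against the same reasoning endpoint. Every field name here is an assumption.

eligibility_check = {
    "query": "eligible_for_benefit",        # which rule to evaluate
    "facts": {"age": 67, "income": 18000},  # application-supplied inputs
    "mode": "boolean",                      # just a yes/no answer
}

explanation_request = {
    "query": "eligible_for_benefit",
    "facts": {"age": 67, "income": 18000},
    "mode": "explain",                      # also return the reasoning trace
}

# Any application that can produce this shape can talk to the reasoner,
# regardless of what the application looks like internally.
```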

As time moves forward, the middle layer will grow in support for backend engines, and support for input and output types. It can remain general to any reasoning problem, it’s just a data transformer.


Welcome, @lf94, and thanks for the feedback!

That’s a valid point, but I’m not talking about a middle layer whose purpose is to abstract away which reasoner is being used. I’m talking about a middle layer whose purpose is to convert data in the format used internally by your application into the format used by the query endpoint, whether JSON, or a GraphQL query, or a SQL query, or something else.

Does that make a difference to your point?

I think that makes no difference; the middle layer should handle both. The application should adhere to what the reasoner layer ultimately offers.

I’m not following.

If I offer a database with a public API, I am responsible for advising users of the schema and the query language. I’m not responsible for writing the queries.

The whole point of standardizing on things like SQL is so that the person building the database does not need to be involved in writing the queries and vice-versa.

Why would it be different than that for Rules as Code?

Likewise, I’m not sure why your response now focuses on writing the queries. :laughing:

I am essentially proposing an SQL-like interface, such that the person building the “logical database” doesn’t need to care about how information comes in / is queried. I think we agree on this.

I think what may be giving us a mismatch in vision is, I envision the schema of a Prolog database to be just a listing of all possible procedures, and any queries are just specially structured JSON which transform into regular Prolog questions / goals.

This would mean just two layers basically: the Prolog layer and the translation layer, which are going to be somewhat coupled.
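A minimal sketch of that translation, assuming a made-up JSON shape and a hypothetical predicate name (neither comes from any real encoding):

```python
import json

# Hypothetical sketch: translate a JSON-encoded query into a Prolog goal.
# The payload shape {"predicate": ..., "args": [...]} and the predicate
# name "eligible_for_benefit" are illustrative assumptions.

def json_to_goal(payload: dict) -> str:
    """Turn {"predicate": ..., "args": [...]} into a Prolog goal string."""
    def term(value):
        if isinstance(value, str) and value.startswith("?"):
            return value[1:].capitalize()   # "?person" becomes variable "Person"
        if isinstance(value, str):
            return f"'{value}'"             # quote atoms to be safe
        return str(value)                   # numbers pass through as-is
    args = ", ".join(term(a) for a in payload["args"])
    return f'{payload["predicate"]}({args}).'

query = json.loads(
    '{"predicate": "eligible_for_benefit", "args": ["?person", "old_age_pension"]}'
)
print(json_to_goal(query))  # eligible_for_benefit(Person, 'old_age_pension').
```

Going the other way (bindings back out to JSON) would be similarly mechanical, which is why the layer can stay thin.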

Re-considering the diagram I think maybe I’m agreeing with it, just not the explanation around it :stuck_out_tongue:

Edit: Ah yeah, if I remember my original point, I still think it’s right:

I think you’re wrong in thinking the middle layer will fall on the application programmer.

Yeah, I’m convinced we are envisioning the same thing, but calling it different things.

The diagram is abstract, but let’s take Prolog as a concrete example. On the left, you have an encoding of a law, and the reasoner, exposed over an API.

The encoding represents the concepts in the law. The law represents objects in the real world. So the encoding has a mapping between the real world and the predicates it uses, the intended meanings of the parameters to those predicates, etc. That is the data schema on the left side.

On the right side, you have an application that collects information from a user and provides legal advice. The purpose of Rules as Code is to facilitate writing that application without knowing what the rules are or how they are implemented on the left.

The choice of what data structure you use in the app is going to depend on application-specific features, but whether it’s a relational database, or objects, or a graph, or what have you, it is a different representation of some of the same real-world things the law refers to.

The middle layer represents the task of translating between the representation on the right and the representation on the left. That is where you need to know what the predicates are, and what their parameters are supposed to mean, so that you can generate facts and queries to send to the Prolog reasoner.
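As an illustration, here is a sketch of that middle-layer translation. The application-side record type and the predicate names (person/1, age/2, income/2) are invented for the example, not taken from any real encoding:

```python
from dataclasses import dataclass

# Hypothetical middle-layer sketch: map an application-internal record onto
# the predicates a Prolog encoding expects. All names here are assumptions.

@dataclass
class Applicant:
    """The application's internal representation of a person."""
    name: str
    age: int
    annual_income: int

def to_facts(a: Applicant) -> list[str]:
    """Translate an Applicant into Prolog facts the encoding understands.

    This is the only place that needs to know both the application's
    data structure and the encoding's predicates.
    """
    return [
        f"person('{a.name}').",
        f"age('{a.name}', {a.age}).",
        f"income('{a.name}', {a.annual_income}).",
    ]

facts = to_facts(Applicant("alice", 67, 18000))
```

If the law’s data structure changes, only `to_facts` needs updating; the rest of the application is untouched, which is the isolation the diagram is after.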

The point of the diagram is to illustrate that the application developer doesn’t need to know what the rules are, but they do need to know the predicates and parameters to use. Which is why I say the code in that middle layer is written by the application developer, not the person who deploys the reasoner.

Where are we getting mixed up?

I think we’re getting mixed up on “application developer”. In my mind the application developer will develop, let’s say, a desktop application, which will communicate with the translation layer.

I can see, though, that in another sense the “application developer” can be a developer who creates the backend, which communicates with the translation layer, which communicates with the reasoner.

There’s a lot of ways to “cut it up” I guess. In my mind the translation layer will benefit the most from a tight coupling to the reasoner, and be implemented as a server which is publicly accessible.

In the end, though, I don’t see the point behind all this translation. Why not have, say, Prolog answer web requests directly?.. It would be super minimal to translate goal returns into a JSON representation for consumption by web clients. The “translation layer” would then be extremely thin, which IMO is a strength, because it exposes the full power of Prolog / the reasoner, and reduces complexity.
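That thin layer really could be tiny. A sketch, with the Prolog engine’s output faked here as a list of variable-binding dicts (a real deployment would get these from the engine itself):

```python
import json

# Hypothetical sketch of the "extremely thin" translation described above:
# take solutions from a Prolog engine (faked as binding dicts) and
# serialize them as JSON for a web client.

def solutions_to_json(solutions: list[dict]) -> str:
    """Wrap a list of variable bindings in a simple JSON envelope."""
    return json.dumps({"count": len(solutions), "solutions": solutions})

# Pretend the reasoner answered a goal with two solutions.
fake_solutions = [{"Person": "alice"}, {"Person": "bob"}]
print(solutions_to_json(fake_solutions))
# {"count": 2, "solutions": [{"Person": "alice"}, {"Person": "bob"}]}
```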

In my mind, that’s exactly what the diagram is describing.

I get the feeling that you are assuming that this middle layer needs to be on the server side, or deployed on the web, or something. It’s more abstract than that. It could be, but it could also be that thin layer of code inside the desktop application that generates the JSON that gets sent to the Prolog server, and processes the results received back from it.

The point is that someone, somewhere, is going to have to write code that translates between the data structure being used in the application and the data structure being used inside the rules, and back. That person is probably going to be the application developer, and they will not be able to do it unless they know what the data structure used in the rules actually is, and how it maps to the real world.

Does that make sense? And if so, is there something about the diagram that caused the confusion that I could fix?

The discussion here is focused on the technical and architectural perspective, but it may be missing an important legal point. I’ll be writing from the perspective of the Polish legal system; I’m not sure if it’s the same in Canada, and if not, then pardon me.

Getting to the point: the State, as the entity issuing law, has an obligation to promulgate it, and to do so in a way that lets every citizen get to know it. The Constitution considers this a condition of the law coming into force. The exact details of how to promulgate the law are described in a separate act, but the effort required of the citizen should be minimal. Therefore the point of view of @lf94 seems to fit here better, especially the part about the coupling of the translation layer and the reasoner. Having the developer write the translation layer for the application seems like a complication that is not only unnecessary but even dangerous. Imagine that a law is published, say, in Latin. Unless you know the language well enough, you will have to use a translation, probably an unofficial one. In fact, any translation from the original text poses a risk of misunderstanding, which is why an official translation is so badly needed. This problem is even greater in the EU, where official legal texts exist in many languages; there have been cases where those texts differed significantly.
But there’s one more thing that made me wonder: the separation of the Reasoning Engine. To be clear, I completely agree that it’s necessary. But legal reasoning may vary depending on the area of law. A simple example is the consequence of breaking the rules: in administrative law the sanction is almost automatic, with no proving of guilt as in criminal law. So there may be a need for separate Engines for different areas of law. And of course there’s a potential discussion about which one to use for regulations not explicitly assigned to a specific area of law.

Issues of how law is promulgated matter only with regard to the law itself. The code is no more the law than a tax form, or a guide for applying for a pension. They are all documents that reflect an interpretation of the law, but no one would confuse them with the law itself. The code just also happens to be executable by machines, but that is by no means a criterion for whether it needs promulgation.

Encoded law is a formalized model of an interpretation, just like many other systems governments currently use to administer laws. We don’t promulgate tax forms.

If the legislatures decided to enact encoded legislation, that would be a totally different story. But it seems like a far-off fantasy.

As for the reasoning types, I am of two minds. Yes, we need different reasoning engines for different use cases. That is undoubtedly true, but the diagram doesn’t suggest otherwise. For any reasoner, and any encoding, and any use of that encoding, there will be a thin layer of data translation between the application and the reasoner.

That said, the difference between admin and criminal law is not where the need for different reasoners will arise. That difference will be reflected in different encodings. They are different laws that behave differently, and so the encodings should reflect that. The reasoner acts at a much lower level of semantics than that. There are no criminal law reasoners or administrative law reasoners. Reasoners differ in the semantics they implement and in their search algorithms: one may use well-founded semantics, another stable model semantics, another answer set programming, another a lambda calculus with defaults, and so on. You would choose among them for technical reasons, not legal ones. Criminal and administrative rules are at a higher level of abstraction, and would be expressed in the code, not the reasoner.

So we do need different reasoners, and we need to represent criminal and admin law differently, but we don’t need different reasoners because of the need to express criminal and admin law differently.

Also, criminal law is extremely vague, dependent on facts that it may be impossible to anticipate, such as evidence of motivation, and subject to strong disagreement as to the facts, their probativity, and their persuasiveness.

There are places that Rules as Code can help in criminal law in the short term, but converting facts into liability conclusions is probably not among them.