==== Grouper Data Integration with PeopleSoft @ UNC Chapel Hill
Ethan Kromhout, 2 November 2022

I wanted to talk about how our data currently gets from PeopleSoft to Grouper.
There's maybe a story that I didn't include about how Grouper data gets indirectly to PeopleSoft, but maybe I can comment on that at the end.

We've got three main paths that the data goes through to get from PeopleSoft to Grouper.

The first path, the largest and most complex, is our homegrown Java app. We call it Directory Manager, and it has basically been
the place where the business logic happens for deciding people's affiliation states and things like that, and for getting those things published out to, particularly, the OpenLDAP directory, but also Active Directory.

image::images/unc3flows.jpg['The 3 paths from Peoplesoft to IAM infrastructure', width=700]

That path through Directory Manager includes our oldest integration, which is with Campus Solutions. We brought up Campus Solutions in 2009, well ahead of bringing up the HR and Finance modules. The integrations use messaging and SOAP, and the reason for those two different transports is that the original Campus Solutions integration was all messaging based.

We found later that a polling strategy was perhaps at least as effective as the messaging we were doing, so when we did the integrations with HR and Finance, those were done as SOAP transport integrations. Either way, it's pretty close to real time. The Campus Solutions information is pushed by messaging, and unless there is a ton of changes that back things up, the updates are near real time.

Those that are polling are polling every five minutes, which is an acceptable form of data integration.

A second path that we use is Informatica, an ETL tool. The main thing it's used for is for developers in the group that runs PeopleSoft to write queries,
and then either push those out to different locations or create APIs in Informatica for people to retrieve data without having direct database access.
The developers do a fair amount of querying of things in those databases, and then pushing information into Grouper for a couple of use cases I'll talk a little bit more about.

Is it near real time for those other integrations? In the case of the Informatica ETLs, it's really up to the developer. Most of those are things that run on kind of a daily cycle, but the developer in Informatica has the capability to schedule things as often as they want them to run. It really is up to them.

The business logic to publish those into reasonable affiliations happens inside Directory Manager. The Directory Manager also does those SOAP queries that I mentioned. So that's the direction we decided to go with for
HR and Finance.

It connects to HCM and queries for jobs and associations, with the construct that we described here. The
poorly worded "affiliate" status, which is basically your sponsored researchers,
contractors, anybody who you can't really say is an employee of the university but who still has quite a formal relationship with the University, goes through what we call our affiliate process. That creates those associations.

Directory Manager calls every five minutes or so
for any new jobs and associations that are available. Then also, once a week, we say, "Okay, cycle through and give me everybody," so that it can do a full synchronization.
And then there's just a very simple query over to PeopleSoft Finance. Really, the only thing that IAM cares about from
our Finance installation is the department names. It gets department numbers that are associated with jobs or associations or student status, but it's nice to have a friendly name to associate with those department numbers, and so it retrieves those friendly names from our Finance install.
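The polling cadence described above (a query for new jobs and associations every five minutes or so, plus a weekly request for everybody) can be sketched as a small scheduling decision. This is only an illustration, not Directory Manager's actual code: the function names, the record fields, and the department lookup table are all assumptions made up for the example.

```python
from datetime import datetime, timedelta

POLL_INTERVAL = timedelta(minutes=5)      # "every five minutes or so"
FULL_SYNC_INTERVAL = timedelta(weeks=1)   # "once a week ... give me everybody"


def plan_sync(now, last_poll, last_full_sync):
    """Decide which kind of query is due at time `now`.

    Returns "full", "incremental", or None when nothing is due yet.
    A due full synchronization takes priority over an incremental poll.
    """
    if now - last_full_sync >= FULL_SYNC_INTERVAL:
        return "full"
    if now - last_poll >= POLL_INTERVAL:
        return "incremental"
    return None


def add_department_names(records, department_names):
    """Attach a friendly department name to each record.

    `records` use a hypothetical "department" key; unknown departments
    fall back to the raw identifier instead of failing.
    """
    return [
        {**r, "department_name": department_names.get(r["department"], r["department"])}
        for r in records
    ]
```

Keeping the weekly full pass alongside the frequent incremental poll means a missed change gets corrected within a week at the latest.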

Directory Manager is responsible for its own OpenLDAP instance running kind of locally to that application, and then we use the built-in OpenLDAP syncrepl to replicate that out to our large OpenLDAP
installation. Finally, that's where Grouper can run its Grouper loader jobs and retrieve things from LDAP. So just like we run Grouper loader jobs for databases, we've got these that run directly against our OpenLDAP installation. That gives Grouper,
first of all, its subject source,
but affiliations with student type and departments also all come through that instance. So these are fairly indirect, as is obvious. But as I mentioned, they perform well; this just works. It's just kind of
old and very UNC specific.

The second flow we mentioned is through Informatica,
and it's doing SQL queries into any of the PeopleSoft pillars along with some other data sources, but here we're just concerned about PeopleSoft. Informatica has the capability to
push those into Grouper via the Grouper web services. A lot of the roles that our PeopleSoft security folks create are useful in other applications besides just inside PeopleSoft, so they function as
proxies for other data that people might have access to because of the roles that they have in one of the PeopleSoft pillars. For example, they can go to a data warehouse.

image::images/uncInfMat.jpg['Informatica path', width=700]

Instead of having a separate role structure for the data warehouse, the PeopleSoft security folks would rather just replicate the roles that they have in PeopleSoft to Grouper, and then Grouper can be queried, or LDAP can be queried,
to retrieve that information about those roles. We also use this for some special groups that our HCM people need to keep track of anyway. An example off the top of my head is who's in a HIPAA-related department. So are you in a HIPAA covered entity?
There are a number of things that want to know that. For example, our Zoom installation needs to know whether you are or not, because it turns certain capabilities inside of Zoom on and off.
So all of those are published to Grouper, and many of them are then published to LDAP to be consumed. This gives Grouper some knowledge about internal PeopleSoft security information, as well as those kind of edge-case groups that we haven't come across a friendlier way to get replicated out for consumption via Grouper.
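As a rough illustration of pushing a replicated role into Grouper over its web services, the sketch below only builds a request body and sends nothing. The field names (`WsRestAddMemberRequest`, `replaceAllExisting`, `subjectLookups`) follow the Grouper WS REST JSON format as I understand it and should be checked against the documentation for the deployed Grouper version. The group name and subject IDs are invented, and nothing here reflects how the Informatica jobs are actually written.

```python
import json


def build_add_member_request(subject_ids, replace_all_existing=True):
    """Build a JSON-serializable body shaped like a Grouper add-member request.

    With replace_all_existing=True the group's membership is meant to end up
    matching exactly the supplied list, which suits replicating a role whose
    authoritative membership lives elsewhere.
    """
    return {
        "WsRestAddMemberRequest": {
            "replaceAllExisting": "T" if replace_all_existing else "F",
            # De-duplicate and sort so the payload is stable between runs.
            "subjectLookups": [{"subjectId": s} for s in sorted(set(subject_ids))],
        }
    }


# Hypothetical group name and subject IDs, for illustration only.
group_name = "example:peoplesoft:roles:data_warehouse_reader"
body = build_add_member_request(["700000002", "700000001", "700000002"])
print(group_name)
print(json.dumps(body, indent=2))
```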

I should have said this before, but please interrupt with questions as we go along, because I know these paths are fairly divergent.

Q: So it's a really quick one about the diagram that you're showing. Are the arrows correct? Are you taking data from Informatica and sending it to PeopleSoft? Or is it the other way?

A: I debated which way to point these arrows, but this is our SQL query, so that is Informatica reaching out to PeopleSoft with the SQL query and pulling data back.

The third flow is just direct Grouper loader jobs, where Grouper queries PeopleSoft. The courses and course roles are probably the main use of these queries.

image::images/uncLoader.jpg['Grouper loader job path', width=700]

We do publish all of our courses out to Grouper each semester, and then break out
the different roles inside the courses. So, student, faculty, primary
teaching assistant, all of those kinds of roles.
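To make the course-role breakout concrete, here is a minimal sketch of the query shape a group-list style loader job typically relies on: one row per membership, with a `group_name` column and a `subject_id` column. The table, its columns, the `example:courses:` folder prefix, and the sample data are all hypothetical, and an in-memory SQLite database stands in for PeopleSoft so the example is runnable.

```python
import sqlite3

# In-memory stand-in for the source database; the schema is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course_enrollment (term TEXT, course TEXT, role TEXT, person_id TEXT)"
)
conn.executemany(
    "INSERT INTO course_enrollment VALUES (?, ?, ?, ?)",
    [
        ("2022fall", "CHEM101.001", "student", "700000001"),
        ("2022fall", "CHEM101.001", "faculty", "700000002"),
        ("2022fall", "CHEM101.001", "teaching_assistant", "700000003"),
    ],
)

# One row per (group, member): each course gets one group per role.
GROUP_LIST_QUERY = """
    SELECT 'example:courses:' || term || ':' || course || ':' || role AS group_name,
           person_id AS subject_id
    FROM course_enrollment
    ORDER BY group_name, subject_id
"""

rows = conn.execute(GROUP_LIST_QUERY).fetchall()
for group_name, subject_id in rows:
    print(group_name, subject_id)
```

Encoding the term in the group name is one way to publish a fresh set of course groups each semester without colliding with earlier ones.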

Currently, the majority of our courses are in Sakai, but we are trying to migrate to Canvas, so we have something a little bit larger than pilot groups on campus at this point.

- - -