Ao usar o InterSystems IRIS como mecanismo de interoperabilidade, todos sabemos e amamos como é fácil usar o Message Viewer para revisar rastreamentos de mensagens e ver exatamente o que está acontecendo em sua produção. Quando um sistema lida com milhões de mensagens por dia, você pode não saber exatamente por onde começar sua investigação.
Ao longo dos meus anos apoiando as produções da IRIS, muitas vezes me peguei investigando coisas como...
- Que tipo de rendimento esse fluxo de trabalho tem?
- Onde está o gargalo?
- Quais são meus erros mais comuns?
Um dos meus lugares favoritos para procurar pistas é a tabela Message Header, que armazena metadados sobre cada mensagem executada no sistema. Estas são as mesmas mensagens que aparecem no Visualizador de Mensagens e nos Rastreamentos Visuais.
Criei uma coleção de consultas SQL úteis e adoraria compartilhá-las com você. Meus exemplos são principalmente de casos de uso de HealthShare ou IRIS for Health, mas podem ser facilmente adaptados para qualquer fluxo de trabalho que você tenha...
-- SQL query to find the # of messages through a component per day select {fn SUBSTRING(timeprocessed,1,10)} AS day, count(*) MessagesThisDay FROM Ens.MessageHeader where TargetConfigName = 'HS.Hub.Push.Evaluator' GROUP BY {fn SUBSTRING(timeprocessed,1,10)} ORDER BY day ASC
-- SQL query to find long-running messages through particular components SELECT PReq.SessionID as SessionId, PReq.TimeCreated as pReqTimeCreated, PRes.TimeCreated as pResTimeCreated, {fn TIMESTAMPDIFF(SQL_TSI_SECOND, PReq.TimeCreated,PRes.TimeCreated)} as TimeDelay FROM ( SELECT ID, SessionId, TimeCreated FROM Ens.MessageHeader WHERE MessageBodyClassName = 'HS.Message.PatientSearchRequest' AND SourceConfigName = 'HS.Hub.MPI.Manager' AND TargetConfigName = 'HUB' ) as PReq INNER JOIN ( SELECT ID, SessionId, TimeCreated FROM Ens.MessageHeader WHERE MessageBodyClassName = 'HS.Message.PatientSearchResponse' AND SourceConfigName = 'HS.Hub.MPI.Manager' AND TargetConfigName = 'HS.IHE.PIX.Manager.Process' ) as PRes on pReq.SessionId = PRes.SessionId WHERE {fn TIMESTAMPDIFF(SQL_TSI_SECOND, PReq.TimeCreated,PRes.TimeCreated)} > 1 ORDER BY SessionId desc ----------------------------------------------------------
/*-- Query to find the bottleneck message through a particular component -- set your threshold for "how long is too long (e.g. 20 seconds) -- look for clusters of messages that are longer than that (e.g. the first cluster started at 3:22:00, then there was a second cluster at 5:15:30) -- in each cluster, look at the first message in that cluster (chronologically). That is likely to be the bottleneck message, and all messages after it are victims of its bottleneck */ SELECT %NOLOCK req.TargetConfigName, req.MessageBodyClassName, req.SessionId, req.TimeCreated, req.TimeProcessed, {fn TIMESTAMPDIFF(SQL_TSI_SECOND, req.TimeCreated, req.TimeProcessed)} as TimeToProcess FROM Ens.MessageHeader AS req WHERE req.TargetConfigName = 'HS.Hub.Management.Operations' AND req.TimeCreated BETWEEN '2021-04-21 00:00:00' AND '2021-04-21 11:00:00' AND {fn TIMESTAMPDIFF(SQL_TSI_SECOND, req.TimeCreated, req.TimeProcessed)} > 20
/* If you have a particular error that you're investigating, try this one. It scans through the Ensemble Error Log for "Object to Load not found" entries, then returns some key fields from the relevant PatientSearchRequest message */ SELECT l.SessionId, mh.MessageBodyID, mh.TimeCreated, psr.SearchMode, psr.RequestingUser, psr.FirstName, psr.MiddleName, psr.LastName, psr.SSN, psr.Sex, psr.DOB FROM Ens_Util.Log as l INNER JOIN Ens.MessageHeader as mh on l.SessionId = mh.SessionId INNER JOIN HS_Message.PatientSearchRequest as psr on mh.MessageBodyID = psr.ID WHERE l.Type = 'Error' AND l.ConfigName = 'HSPI.Server.APIOperation' AND l.Text like 'ERROR #5809: Object to Load not found%' AND mh.MessageBodyClassName = 'HS.Message.PatientSearchRequest' AND mh.SourceConfigName = 'HSPI.Server.APIWebService' AND mh.TargetConfigName = 'HSPI.Server.APIOperation'
-- Scan the Ensemble Error Log for a particular timeframe. Count up the different types of errors SELECT substring(text,1,80) as AbbreviatedError, count(*) as NumTheseErrors FROM Ens_Util.Log WHERE Type = 'Error' AND TimeLogged > '2022-03-03 00:00:00' -- when the last batch started AND TimeLogged < '2022-03-03 16:00:00' -- when we estimate this batch might end GROUP BY substring(text,1,80) ORDER BY NumTheseErrors desc
-- Find the Gateway Processing Time for each StreameltRequest / ECRFetchResponse pair SELECT sr.Gateway,request.sessionid, response.sessionid, request.timecreated AS starttime, response.timecreated AS stoptime, datediff(ms,request.timecreated,response.timecreated) AS ProcessingTime, Avg(datediff(ms,request.timecreated,response.timecreated)) AS AverageProcessingTimeAllGateways FROM Ens.MessageHeader request INNER JOIN Ens.MessageHeader AS response ON response.correspondingmessageid = request.id INNER JOIN HS_Message.StreamletRequest AS sr ON sr.ID = request.MessageBodyId WHERE request.messagebodyclassname = 'HS.Message.StreamletRequest' AND response.messagebodyclassname = 'HS.Message.ECRFetchResponse'