By Scriptorium - The Content Strategy Experts
Is it really possible to configure enterprise content—technical, support, learning & training, marketing, and more—to create a seamless experience for your end users? In episode 177 of the Content Strategy Experts podcast, Sarah O’Keefe and Bill Swallow discuss the reality of enterprise content operations: do they truly exist in the current content landscape? What obstacles hold the industry back? How can organizations move forward?
Sarah: You’ve got to get your terminology and your taxonomy in alignment. Most of the industry I am confident in saying have gone with option D, which is give up. “We have silos. Our silos are great. We’re going to be in our silos, and I don’t like those people over in learning content anyway. I don’t like those people in techcomm anyway. They’re weird. They’re focused on the wrong things,” says everybody, and so they’re just not doing it. I think that does a great disservice to the end users, but that’s the reality of where most people are right now.
Bill: Right, because the end user is left holding the bag trying to find information using terminology from one set of content and not finding it in another and just having a completely different experience.
Transcript:
Bill Swallow: Welcome to The Content Strategy Experts podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about enterprise content operations. Does it actually exist? And if so, what does it look like? And if not, how can we get there? Hi, everyone. I’m Bill Swallow.
Sarah O’Keefe: And I’m Sarah O’Keefe.
BS: And Sarah, they let us do another podcast together.
SO: Mistakes were made.
BS: So today we’re talking a little bit about enterprise content operations. If it exists, what it looks like. If it doesn’t, why doesn’t it exist? What can people do to get there?
SO: So enterprise content ops, I guess first we have to define our terms a little bit. Content operations, content ops, is the system that you use to manage your content. And by manage, I mean not the software, but how do you develop it, how do you author it, how do you control it, how do you deliver it, how do you retire it, all that stuff. So content ops is the overarching system that manages your content lifecycle. And when we look at content ops from that perspective, and of course we’re generally focused on technical content, but when we talk enterprise content ops, it’s customer-facing content, which includes techcomm, but also learning content, support content, product data potentially, and some other things like that. And ultimately, when I look at this, again bringing the lens back or going back to the 10,000-foot view, we have some enterprise solutions but only on the delivery side. The authoring side of this is basically a wasteland. So I have the capability of creating technical content, learning content, support content, and putting them all into what appears to be some sort of a unified delivery system. But what I don’t really have is the ability to manage them on the back end in a unified way, and that’s what I want to talk about today.
BS: So those who are delivering in that fashion, so being able to provide customer-facing information in a unified way, as far as their system for content ops goes, it’s more, I would say, human-based. So it’s a lot of workflow. It’s a lot of actual management of content and management of content processes outside of a unified system.
SO: So almost certainly they don’t have a unified system for all the content, and we’ll talk about why that is I think in a minute. It’s not necessarily human-based, it’s more that it’s fragmented. So the techcomm group has their system, and the learning group has their system, and the support team has their system, et cetera. And then what we’re doing is we’re saying, okay, well once you’ve authored all this stuff in your snowflake system, then we’ll bring it over to the delivery side where we have some sort of a portal, a website portal, a content delivery platform (CDP), that puts it all together and makes it appear to the end user that those things are all in some sort of a unified presentation. But they’re not coming from the same place, and that causes some problems on the back end.
BS: Right, and ultimately the user of that content doesn’t really care if it’s a unified presentation. They just want their stuff. They don’t want to have a disjointed experience, and they want to be able to find what they’re looking for regardless of what type of content it is.
SO: Right, and the cliché is “don’t ship your org chart,” which is 100% what we’re doing. And so let’s talk a little bit about what does that mean, what are the pre-reqs? So in order to have something that appears to me as the content consumer to be unified, well for starters, you mentioned search. I have to have search that performs across all the different content types and returns the relevant information. And what that usually means is that I have to have unified terminology. I’m using the same words for the same things in all the different systems. And I need a unified taxonomy, a classification system and metadata, so that when I do a search, and maybe I’m categorizing or classifying things down and filtering, that filtering works the same way across all the content that I’ve put into the magic portal. So taxonomy and terminology are the things that’ll make your search, relatively speaking, perform better. So we have this on the delivery side and that’s okay-ish, or it can be, but then let’s look at what we’re doing on the authoring side of things because that’s where these problems start.
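The point about shared metadata can be sketched in a few lines: a portal facet filter only spans silos whose labels agree. This is a minimal illustration, not any particular product’s behavior; every item, field, and value here is hypothetical.

```python
# Sketch: why shared taxonomy matters for cross-silo filtering.
# All content items and facet keys below are hypothetical.

docs = [
    {"title": "Installing the pump", "source": "techcomm",
     "product": "pump-100", "type": "task"},
    {"title": "Pump installation course", "source": "learning",
     "product": "pump-100", "type": "task"},
    {"title": "Pump won't start", "source": "support",
     "product": "Pump 100",  # unaligned product label breaks the facet
     "type": "troubleshooting"},
]

def filter_by_facet(items, **facets):
    """Return items matching every facet exactly, like a portal filter."""
    return [d for d in items
            if all(d.get(k) == v for k, v in facets.items())]

# With a shared taxonomy, one filter would span all three silos; here
# the support article is silently dropped because its label differs.
hits = filter_by_facet(docs, product="pump-100")
print([d["source"] for d in hits])  # → ['techcomm', 'learning']
```

The support article disappears from the result not because it is irrelevant, but because its metadata was authored in a different silo with different conventions, which is exactly the end-user experience described above.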
BS: So what do they start looking like?
SO: Well, maybe let’s focus in on techcomm and learning content specifically. We’ll just take those two because if I try and talk about all of them, we’re going to be here for days and nobody wants that. All right, so I have technical content, user guides, online help, quick snippets, how-tos. And I have learning, training content, e-learning, which is enabling content: I’m going to try and teach you how to do the thing in the system so that you can get your job done. Now, let’s go all the way back to the world where we have an instructional designer or a learning content developer and a technical content developer. So for starters, almost always those are two different people, just right off the bat. And instructional designers tend to be more concerned with the learning experience: how am I going to deliver learning and performance support to the learner? And the technical writers, technical content people tend to be more interested in how do I cover the universe of what’s in this tool set, or this product, and cover all the possible reasonable tasks that you might need to perform, the reference information you need, the concepts that you need? It’s a lot of the same information. There’s just a slightly different lens on it. And in the big picture, we should be able to take a procedure out of the technical content, step one, step two, step three, step four, and pretty much use that in a learning context. In a learning context, it’s going to be, hey, when you arrive for your job at the bank every morning you need to do things with cash that I don’t understand. And here’s a procedure, and this is what you’re going to do, steps 1, 2, 3, 4, 5, and you need to do them this way and you need to write them down, and it tends to be a little more policy and governance focused, but broadly it’s the same procedure. So there should be the opportunity to reuse that content. And big picture, high-level estimate is probably something like 50% content overlap.
So 50% of the learning content can or should be sourced from the technical content. The technical content is probably a superset in the sense that the technical content covers, or should cover, all the things you can do, and training covers the most common things or the most important things that you need to do. It probably doesn’t cover a hundred percent of your use cases. Okay, so now let’s talk about tools.
BS: Right because I was going to say these two people, the technical writer and the training developer, they are using, at least historically, two very different sets of tools to get their job done.
SO: Right. So unified content solutions, without getting into too many of the specifics, which will get me in big trouble, basically the vendors are working on it, but they’re not there yet. There’s a lot of point solutions. There’s a lot of, oh yes, we have a solution for techcomm and we have a solution for learning and we have a delivery solution, but there’s not a unified back end where you can do all this work.
And some of the vendors have some of these tools in their stable, some of them don’t. But from my point of view, it doesn’t really make a whole lot of difference whether you buy two point solutions from separate vendors or from the same vendor because right now they’re disconnected.
BS: They’re two point solutions.
SO: Yeah, they’re all point solutions. So it’s not good. And then that brings us to how can we unify this today? What can we do and what kind of solutions are our customers building or are we building with our customers? So a couple of things here. Option A is you take your structured content solution and you say, “Okay, learning content people, we’re going to put you in structured content. We’re going to move you into the component content management system. We’re going to topicify all your content, and we’re basically going to align you with the techcomm toolset and make that work.” We have a few customers doing that. It works well for learning content developers that are willing to prioritize the document structure and process over the flexibility in the downstream learning experience.
BS: Right.
SO: That’s a small set of people. Most learning content developers are not willing to prioritize efficiency and structure over delivery, which I think is actually the root cause.
BS: Right. Now, those who are doing this, they are seeing some benefit in being able to produce a wide variety of their training deliverables from that unified source. But again, it comes back to how willing people are to give up the flexibility that they have in developing course content.
SO: We can talk about big picture and we can talk about all the things, but this decision, this approach 100% of the time comes down to how badly do you want to be able to flail around in PowerPoint. And if having the ability to put random things in random places on random slides is critical, then this solution will not work.
BS: So on the flip side, you would then look to maybe somehow connect your technical communication system to your learning repository.
SO: Right. So you take your techcomm content and you treat it as a data source essentially for your learning content, and you just flow it into the learning authoring environment. It turns out that’s hard.
BS: It’s very hard.
SO: Super difficult. It’s difficult to get your structured content out into a format that the learning content system can accept in a reasonable manner.
BS: And if your content is highly structured, you’re likely losing a lot of semantic data along the way to get it there.
SO: Yeah, you lose a lot, and it’s just bad. And ultimately, I mean we talk about flowing it in there, but this almost always means that you’re going to be copying and pasting and reformatting and re-reformatting, and it’s just terrible.
BS: So more often than not, we’re not seeing this level of unification then.
SO: Yeah, I mean, are you connecting your techcomm and your learning in a structured environment? A few people, yes. And for the right use case, it’s great. Or flow the techcomm content down into the learning environment, but ultimately not worth it, we’ll just copy and paste. So in terms of unification, basically none of the above, right?
BS: Mm-hmm. So how would people get there?
SO: So there’s a couple of options. The probably most common one is some sort of a DIY solution. We’re going to find a way to glue these systems together. We’re going to find a workflow that involves converting the techcomm content, which usually is created first, and moving it into the learning content. Again, for the right group, for the right environment, unifying everything in a structured authoring environment makes a lot of sense. I think ultimately that’s where it’s going to land, but the structured content systems need to do some work to make themselves into what amounts to a reasonable, viable authoring solution for the learning content people. Basically the learning content people are not willing to put up with the shenanigans that ensue in order to use a structured content system. And I’m not even sure they’re wrong, right?
BS: Yeah.
SO: They’re just saying, “No, this is terrible and we’re not doing it.” Okay, well, that’s fair. So either you tinker and put it all together in some way. Option B is wait for the vendors, wait for the vendors to fix this problem, fix this requirement, and deliver some systems that have a solution here. And it’ll be a year or two or five or 20, and eventually they’ll get to it. You can go with a delivery-only solution, so we’re only going to solve this on the delivery side. If you do that, you really, really, really, really need an enterprise-level taxonomy and terminology project group.
BS: Absolutely.
SO: You’ve got to get that aligned. You cannot go around having half your text say entryway and half say hallway, half say study and half say den. And I’m halfway down a Clue reference, was it the wrench or the outlet? No, no, no, okay. You have to get your terminology in alignment. You must, because otherwise people search on oven and it doesn’t return range because those are in fact… Well, okay, they’re not exactly the same thing, but close enough, so those types of things. So you’ve got to get your terminology and your taxonomy in alignment. Most of the industry, like most of the people out there that are doing techcomm and learning content, I am confident in saying have gone with option D, which is give up. Just don’t do it. Just don’t bother. We have silos. Our silos are great. We’re going to be in our silos, and I don’t like those people over in learning content anyway. I don’t like those people in techcomm anyway. They’re weird. They’re focused on the wrong things, says everybody, and so they’re just not doing it. I think that does a great disservice to the end users, but that’s the reality of where most people are right now.
BS: Right, because the end user is left holding the bag there trying to be able to find information using terminology from one set of content and not finding it in another and just having a completely different experience.
SO: They make it a you problem.
BS: Yeah. So if you’re seeing opportunities to unify content operations in your organization, what are some key ways of communicating that up so that you can begin to get some funding, some support, some executive level buy-in to do these things?
SO: The technology problem is hard. Putting everybody in an actual unified authoring environment is a really hard problem. So I think what you want to do is go for the easier solutions where you can get some wins. And the easier solutions where you can get some wins are consistent terminology across the enterprise. So we’re going to have some conversations about terminology and what we need to do in terms of terminology, and everybody’s going to agree on the words we’re going to use. Taxonomy, what does our classification system look like? What are the names for our products and how do we label things so that when we deliver all these different content chunks, they’re coming from all these different systems, we can bring them into alignment? I mean, you can do the work on the back end to align taxonomy or you can do it on the delivery side to say these things are synonyms. So there are some ways of addressing this even when you get down into the delivery end of things. But I think what you want to do is start thinking about the things… Oh, and translation management, which ties into both terminology and taxonomy. I think you want to start maybe with those things and then slowly work your way upstream, like a salmon, avoiding the bears on the… Okay, you’re going to try and work your way upstream towards the authoring. Because ultimately if you look at this from an efficiency point of view, it would be much, much more efficient to have unified authoring and put it all together. It’s just that right now today, that’s a heavy lift and it only makes sense in certain environments. So what can we do to prepare for that so that when we do get to that point and those tools do start to unify a little bit better, we’ve done the legwork that’ll make it easier to make that transition as we go?
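The delivery-side fix Sarah mentions, declaring that terms like oven and range should match each other at search time, can be sketched like this. This is a hedged illustration of query-time synonym expansion, not any particular product’s API; the terms and titles are invented.

```python
# Sketch: delivery-side synonym mapping, one way to patch over
# unaligned terminology when authoring systems can't be unified.
# The synonym table and document titles are hypothetical.

SYNONYMS = {"oven": {"oven", "range"}, "range": {"oven", "range"}}

docs = [
    "Cleaning the oven",
    "Range temperature calibration",
    "Dishwasher rack removal",
]

def search(query, corpus):
    """Expand the query with declared synonyms before matching titles."""
    terms = SYNONYMS.get(query.lower(), {query.lower()})
    return [d for d in corpus
            if any(t in d.lower() for t in terms)]

print(search("oven", docs))
# → ['Cleaning the oven', 'Range temperature calibration']
```

Without the synonym table, the search for oven returns only one of the two relevant documents; the mapping itself is the output of exactly the enterprise terminology project described above, so the code is cheap but the alignment work is not.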
BS: Right. So it’s spending the effort to unify as much as you can the content and the language and the organization, as well as trying to keep pace with where I guess all of these different industry tools are going and making sure that you are making improvements in the right direction. So if you’re thinking about structured content, that you are keeping an open mind as to where and how I guess these other groups can start leveraging what you’re using and vice versa. And I guess talking with the other groups in your organization. So if you’re in techcomm, then talk to the training group, see what they’re doing, see what their plan is, what’s their five-year roadmap? Are they looking at certain technologies? How might that play into your development, and vice versa, being able to share that information.
SO: And I know, Bill, you’re doing a session on re-platforming at tcworld this November 2024. And when you’re thinking about re-platforming, what are some of the factors that you should be looking at there that tie into this?
BS: Well, it directly plays into that next step of we have a platform on the techcomm side, we bought it 12 years ago, it served our needs. But the training group, let’s say, has been talking and they have this other system that they’re not too happy with, and they want to see if they can start sharing our content.
Well, then you have an open conversation to say, “Okay, how can we get to a shared solution, what do these requirements look like,” and go ahead and pick a system that kind of meets both requirements. But then you have that heavy lift of just saying, “Okay, so now we have these two different old systems and we need to dump our content, and I use that term very generally, into the new system, so that everyone from those two groups can now author in the same place.”
SO: And I’m thinking as you’re evaluating these systems, all other things being equal, which they are not, but all other things being equal, you would look for the one that’s more open, that is more flexible knowing that things are going to change because they always do. What’s available to us that’ll give us maximum flexibility in a year or two or five when these new requirements come in that we have not anticipated at this point?
BS: Right, because you’re exiting your old systems because they are potentially inflexible. We cannot accommodate anything new. We can sustain what we’re doing indefinitely, but we can’t accommodate this new thing that we need to do.
SO: Yeah, it’s interesting because looking at the techcomm landscape, we have a lot of customers and a lot of just generalized ecosystem that has moved into structured content, and starting as early as the late nineties or maybe even the early nineties in Germany, people were moving into structured content at scale. And now we’re looking at it and saying, “Okay, well there’s all this other content out there and we need to look at that and we need to look at whether we can bring that into the structured content offerings.” But not unreasonably, those other groups are looking at it and pushing back and saying, “This isn’t optimized for the kind of work that I do. It’s optimized for the kind of work that you people do. So how can we improve this and bring it into alignment with what the new and additional stakeholders need?” And it’s a hard problem, I really feel for the software vendors. It’s easy for us sitting here on the services side to say, “Hey, do better,” because we’re not doing the work.
BS: Very, very true. And at that point, you have a winner and a loser, and I hate to say it that way, but you have a winner and a loser on the system side at that point. Where you’re pulling one other group in because you have an established structural approach and they could benefit from it, but basically they have to absorb the brunt of the change that’s going to happen, and it’s not necessarily fair.
SO: Well, yeah. I mean, life isn’t fair. But also I’ll say that that pain that you’re talking about, the people that are now in structured content, they had that pain. It was just 10 years ago-
BS: Very true.
SO: …and they’ve forgotten. For those of you that were around and in this industry 10 years ago, or 20 years ago, or 25, I mean, remember what it was like trying to get people to move from you will pry unstructured FrameMaker from my cold, dead hands. You’ll pry Microsoft Word from my cold, dead hands. You will pry PageMaker, Interleaf, Ventura Publisher from my cold, dead hands.
BS: WordStar.
SO: Okay. So tools come and go, and the tool that is state-of-the-art for today, BookMaster, is not necessarily the tool that’s going to be state-of-the-art for tomorrow, or yesterday. I mean, basically this stuff evolves and we have to evolve with it, and we have to understand what are the best and most reasonable solutions that we can offer to a customer or to a content operations group in order to deliver on the things that they need to deliver on.
BS: Very true. So there are no unicorns.
SO: No unicorns, or maybe more accurately you can construct your own unicorn and it might be awesome, but it’s going to be a lot of work.
BS: So I think we could probably talk about this for hours because there are so many different facets that we can touch upon, but I think we’ll call it done for now, and maybe we’ll see you soon in a new episode?
SO: Yeah, if this speaks to you, call us because we’ve barely scratched the surface.
BS: All right. Thanks, Sarah.
SO: Thanks.
BS: And thank you for listening to The Content Strategy Experts podcast brought to you by Scriptorium. For more information, visit Scriptorium.com or check the show notes for relevant links.
The post Do enterprise content operations exist? appeared first on Scriptorium.
Whether you’re surviving a content operations project or a journey through treacherous caverns, it’s crucial to plan your way out before you begin. In episode 176 of the Content Strategy Experts podcast, Alan Pringle and Christine Cuellar unpack the parallels between navigating horror-filled caves and building a content ops exit strategy.
Alan Pringle: When you’re choosing tools, if you end up with something that is super proprietary, has its own file formats, and so on, that means it’s probably gonna be harder to extract your content from that system. A good example of this is those of you with Samsung Android phones. You have got this proprietary layer where it may even insert things into your source code that is very particular to that product line. So look at how proprietary your tool or toolchain is and how hard it’s going to be to export. That should be an early question you ask during even the RFP process. How do people get out of your system? I realize that sounds absolutely bat-you-know-what to be telling people to be thinking about something like that when you’re just getting rolling–
Christine Cuellar: Appropriate for a cave analogy, right?
Alan Pringle: Yes, true. But you should be, you absolutely should be.
Transcript:
Christine Cuellar: Welcome to The Content Strategy Experts podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we’re talking about setting your ContentOps project up for success by starting with the end in mind, or in other words, planning your exit strategy at the beginning of your project. So I’m Christine Cuellar, and with me today is Alan Pringle. Hey, Alan.
Alan Pringle: Hey there.
CC: And I know it can probably sound a bit defeatist to start a project by thinking about the end of the project and getting out of a new process that maybe you’re building from the beginning. So let’s talk a little bit more about that. Why are we talking about exit strategy today?
AP: Because everything comes to an end. Every technology, every tool, and we as human beings, we all come to an end. And at some point, you are going to have tools, you’re gonna have technology and process that no longer supports your needs. So if you think about that ahead of time, and you’re ready for that inevitable thing, which will happen, you’re gonna be much better off.
CC: Yeah. So this conversation started around the news of the DocBook Technical Committee closing, and that’s kind of a big deal for a lot of people, and it kind of sparked this internal conversation about like, you know, what if that happened to you? How can people avoid getting caught by surprise? And of course, as Alan just mentioned, the answer to that is really to begin with the end in mind, to have an exit strategy because everything does end at some point. So this got me thinking about, you know, I don’t know, Alan, you’ve seen the horror movie The Descent, right? You’ve seen that movie? Yes, because it’s amazing and it’s a horror movie and it’s awesome. So it made me kind of think of that because, you know, this group, and I’m not going to spoil it, no spoilers for people who haven’t seen it yet, but, if you haven’t, go watch it. The first one’s my favorite. I haven’t seen the second one, so I’m biased. Anyways, that’s not the point. This group plans to go along one path, you know, down these caves which are definitely in North Carolina, right Alan? That’s definitely where they take place.
AP: Well, they say it is in North Carolina, but it is quite clearly not filmed in North Carolina. As someone who is familiar with Western North Carolina, I had to laugh at this movie trying to pass off somewhere in the UK as like the Appalachian Mountains, but that’s just a quibble. So go ahead with your story.
CC: Anyways, yeah, they got a mountain in there, right? And then there’s a path into the mountain. Of course, they’re going to explore this deep, dark cave. So they’re descending as the name implies. And so they’re planning to go along one path. I think someone maybe tricked someone else along the way. I can’t remember. But they’re planning on going down one path. And there’s a lot of things that begin to happen that they didn’t plan on. And one scene in particular, there’s a cave that collapses and of course that means they have to pivot, right?
AP: Yeah.
CC: So when you’re thinking about building an exit strategy and trying to plan for things that you can’t anticipate, how do you anticipate things you can’t anticipate?
AP: Well, first of all, let’s be clear. All the things that happened in that movie happened in a period of like two hours or an hour and a half. And part of the issue with any kind of process and operations is things can slowly start to go badly and you just kind of keep on trucking and really don’t pay attention to it. But…
CC: Yes.
AP: It’s not just about fine-tuning your operations. That’s a whole other conversation. Your process is going to require updating every once in a while. There are going to be new requirements and you need to address them in your content ops by changing your process, updating your tools, maybe adding something new. What we’re talking about here is when those tools and that process are coming to an end, for example, because a particular piece of software is being deprecated. It is end of life. What are you going to do?
CC: Mm-hmm.
AP: What if there is a merger? You have a merger and there are two systems doing the same thing. One of those systems is going to lose and go away. Why are you going to maintain two of the same systems? So you’re going to have to figure out how to pivot to get to that.
CC: Mm-hmm.
AP: So there are all of these things that can happen that mean you have got to exit whatever you were doing and move into something new, something different. And the reasons are many, like I just mentioned, but the end result is, are you ready for when that happens? In a lot of cases, frankly, people aren’t.
CC: Yeah. So if you could give listeners three pieces of advice on how to be less dependent on a particular system, if you had to narrow it down to three, what would you suggest to help them not be just dependent on one particular system or maybe a set of systems?
AP: One thing is when you’re choosing tools, if you end up with something that is super proprietary, has its own file formats, et cetera, that means it’s probably gonna be harder to extract your content from that system because it is proprietary. Even if your content is in a standard, and in a lot of cases, of course, I’m talking about DITA, the Darwin Information Typing Architecture, an XML standard. Even with DITA, even though it’s an open standard, some of the systems that can manage DITA content put their own proprietary layer on top. A good example of this is, for example, those of you with Samsung Android phones. I’ve had one in the past.
CC: Yeah, that’s me.
AP: Samsung puts their own proprietary layer on top of the Android operating system, and a lot of that stuff frankly I hate, but that’s not the point of this conversation. It’s the same issue. You have got this proprietary layer where it may even insert things into your source code that is very particular to that product line. So look at how proprietary your tool or toolchain is and how hard it’s going to be to export. That should be an early question you ask during even the RFP process. How do people get out of your system? And I realize that sounds absolutely bat-you-know-what to be telling people to be thinking about something like that when you’re just getting rolling–
CC: Appropriate for a cave analogy, right?
AP: Yes, true. But you should be, you absolutely should be.
CC: And we’re going to get onto the other two things to think about in just a second, but a question there: what are some maybe green flags for how that question should be received, or how you want that question to be received, if it’s going to be the right fit?
AP: I would hope some variation of the answer would be you can export to this standard, although that often is probably not the answer that you’re going to get.
CC: Okay, a standard. What are some other things people need to keep in mind in order to not be system-dependent?
AP: I don’t know if it’s so much system-dependent, but you need to think culturally about what this means. People become very attached to their tools because they become very adept. They become experts in how to manipulate and do whatever with a certain tool set. And they feel like, you know, I am in total control here. I know what I’m doing. Things are running well.
CC: Yeah.
AP: And when it turns out that tool is going to have to go away, their entire process and their focus on being an expert, it’s blown. It’s just blown away. And that can be very hard to deal with from a person level, a people level, having to tell people, yeah, this is a shock to your system. You’ve been using this tool forever. You’re really good at it. Unfortunately, that tool is being discontinued. We’re gonna have to move to something else. That can be very hard for people to swallow and it’s understandable.
CC: Mm-hmm.
AP: It’s completely understandable. One other thing that I will mention: if you can, get your source content (not the actual delivery points I’m talking about here, but wherever you’re storing your source) in some kind of format-neutral file format. And again, I’m talking mostly about XML content, extensible markup language, because when you create that content, you are not building in the formatting. You’re creating it as a markup language. And the minute your content is in a markup language, it becomes easier. I shouldn’t say easy, because nothing here is easy. There is a better path to moving that content, possibly to another standard, for example, because you can set up a transformation process that’s very programmatic.
CC: Mm. Yeah.
AP: This particular element in this model becomes this. And when you hit this particular element in this model, you start a new file. If you see this particular attribute, it needs to be moved over here to this attribute.
CC: Hmm.
AP: So it’s a matching process that you have to do, so it can be programmatic. So anytime you get into something that’s XML, what does that X stand for? It stands for extensible. That gives you a little more control because it gives you more flexibility. And it’s weird to think that more flexibility gives you more control. That almost seems kind of diametrically opposed, but it’s true.
CC: Yeah.
AP: Because you can move something out more easily because it is something that can be sliced, diced, transformed. So there’s that angle.
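The element-and-attribute matching process Alan describes can be sketched programmatically. This is a hypothetical illustration, not a real migration: the element names (`sect`, `para`, `head`) and both mapping tables are invented, and a real project would often use XSLT or a fuller toolchain.

```python
# Hypothetical sketch of a programmatic XML-to-XML mapping:
# elements from one model are renamed into another model,
# and attributes are relocated. All names here are invented.
import xml.etree.ElementTree as ET

# Old model's element names -> new model's element names.
ELEMENT_MAP = {"sect": "topic", "para": "p", "head": "title"}
# Old attribute names -> new attribute names.
ATTRIBUTE_MAP = {"sect-id": "id"}

def transform(element):
    """Recursively rewrite one element tree from the old model to the new."""
    new_el = ET.Element(ELEMENT_MAP.get(element.tag, element.tag))
    for name, value in element.attrib.items():
        new_el.set(ATTRIBUTE_MAP.get(name, name), value)
    new_el.text = element.text
    for child in element:
        new_el.append(transform(child))
    return new_el

source = ET.fromstring(
    '<sect sect-id="s1"><head>Install</head><para>Plug it in.</para></sect>'
)
print(ET.tostring(transform(source), encoding="unicode"))
```

Because every rule lives in a lookup table, extending the mapping is a matter of adding entries rather than rewriting logic, which is what makes this kind of migration repeatable.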
CC: Yeah. Yeah. So, okay. As a non-technical person myself, I’m gonna see if I can summarize this, and you tell me whether or not this is accurate. From a very high-level view, rather than keeping all of your content in one particular content management system, it’s all stored in a separate box, a separate repository. And then whatever system you’re going to use is your delivery output. Is that accurate to say? Okay.
AP: Yes, because if you’re in a file format that does not have the formatting of your content built in, that means you can deliver to a bunch of different presentation layers. You can apply the formatting automatically.
CC: Okay.
AP: And that’s really where I was headed. You can even see your new system as almost a delivery target: I need to figure out how to transform my source content in a way that a new tool, a new system, can understand. So basically you’re saying, okay, let’s export it, let’s clean it up, maybe do some automated transformations and programming on it to make it more ingestible by the other system.
CC: Mm-hmm.
AP: So you could even look at this process of moving from one system to another as being really your final destination (another horror movie), your final delivery target: moving that source content into another system that you’re about to use.
CC: Yeah. Thank you also for unpacking that, because that was much more clear than my example. That was really helpful. So since people are planning with the end in mind, how far out are we thinking this exit strategy would typically be implemented? How far down the road is this?
AP: And that’s the thing, I can’t answer that question, because you never know what is going to happen, right? It’s like the cave collapse analogy you mentioned. Sometimes you have to take a detour, not of your own choice or of your own making. And again, mergers, tools being discontinued, companies that go under, all of these things can happen. And you need to have a contingency.
CC: Mm. Never know. So it’s a contingency plan, really. Yeah.
AP: And you need to have a contingency plan in place to get ready to exit. It’s just like during natural disaster season, you hear people say, do you have your emergency preparedness kit ready? It’s a very similar thing, but it’s in the corporate world. This is as much about risk reduction as it is about smooth content operations, at least from my point of view.
CC: Yeah. And you mentioned several big things that can happen that trigger the need to exit and move on. Are there any scenarios where there isn’t a big thing that happens, like a merger or a business closing? Are there quieter cases, where you may not realize that it’s time to exit? Where the need to exit is more subtle?
AP: If your content process, your content operations cannot support new business requirements, for example, you need to connect to a new system, you need to deliver your content in another format. If your current system and tools can’t do that, that is a sign you’re probably going to have to find the exit door and find something that will support whatever it is that you cannot do.
CC: Mm-hmm.
AP: It’s usually you just hit this wall where you realize we have taken this tool and this process as far as it can go. It is time to move on. And here I am going to toot the consultant horn again. But that is when you start getting that uneasy feeling, that’s when you can talk to a consultant who can help you unpack it to see if it’s really a sign that the tool is no longer going to fit you or if there’s something you can do within your current system to make things work. That’s when a third-party point of view can be very valuable.
CC: Question for you on that third party perspective, since you’ve seen companies make these transitions many times and exit something and go into a new one, what’s one thing or pitfall that companies need to be aware of that maybe isn’t included in their exit strategy that should be?
AP: Something that’s very common is to frame everything you want from your new system from the perspective of what your current system is doing. Even though your current system is not going to do something that you need it to do, you still are so fixated on how it is doing things and you can’t get beyond that. That can be a huge problem. Being able to step back and objectively look. This system can’t do this.
CC: Mmm.
AP: We need it to do that. And this is how we need to get there. People can get so mired in the, this is how we’re doing things. And we’re going to move over to this new system and do the same exact thing, just in new tools. That’s not a reason to move. There’s some compelling thing that’s forcing you out of that other tool. So now is the time to change things, update things, make some nips and tucks. Maybe undo some things. Don’t just wholesale move over into a new system and keep things status quo. Otherwise, why bother?
CC: Yeah, yeah. Is there anything else you can think of when you get to when it’s time to start the exiting process? Anything else that you can think of that companies need to have at the forefront of their mind?
AP: It’s the communication. And that includes the vendors, and it includes the people inside the company who are using the tools. And I would also mention it includes procurement. They need to understand the whens, the whys, why you’re having problems, all of that, because there can be contractual obligations about when a license ends and another one begins. So you’ve got to keep that information flowing to all kinds of parties to make this exit, this transition, work well.
CC: Yeah, you want it to end like the American version of The Descent where the hero actually gets out and drives away in the car, not like the UK version where the person is still stuck in the cave, which is the better ending for a horror movie, I will say, but not for your content ops project. Definitely.
AP: Yeah, but at least in a content ops project, you’re not going to get eaten by some humanoid blind thing living in a cave.
CC: Hopefully, right? That’s ideal. That’s the best case scenario.
AP: Hopefully not. Yeah.
CC: Well, Alan, is there any other parting advice you can think of before we wrap up today’s topic?
AP: Don’t go into a cave unprepared. Okay? Just don’t. How’s that?
CC: Yeah, that is actually good advice. Don’t go unprepared. That’s really helpful. And like Alan mentioned earlier, a third-party perspective, and I know it’s very biased for me to be saying it, but a third-party perspective when it’s time to either make the exit transition or plan for it, content strategists can really help with that, because we’ve seen a lot of things. A lot of caves. Yes. Yeah.
AP: A lot. Maybe not cave dwellers, but a lot.
CC: Hopefully, hopefully no one has actually seen those. Yeah, well, thank you so much for being here, Alan. I really appreciate you talking about this with me today. And thank you for listening to the Content Strategy Experts Podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Survive the descent: planning your content ops exit strategy appeared first on Scriptorium.
Are you looking for real-world examples of enterprise content operations in action? Join Sarah O’Keefe and special guest Adam Newton, Senior Director of Globalization, Product Documentation, & Business Process Automation at NetApp for episode 175 of The Content Strategy Experts podcast. Hear insights from NetApp’s journey to enterprise-level publishing, lessons learned from leading-edge GenAI tool development, and more.
We have writers in our authoring environment who are not writers by nature or by bias. They’re subject matter experts. And they’re in our system and generating content. That was about joining us in our environment: reap the benefits of multi-language output, reap the benefits of fast updates, reap the benefits of being able to deliver a web-like experience as opposed to a PDF. But what I think we’ve found now is that this is a data project. This generative AI assistant has changed my thinking about what my team does. Yes, on one level, we have a team of writers devoted to producing the docs. But in another way, you can look at it and say, well, we’re a data engine.
— Adam Newton
Related links:
LinkedIn:
Transcript:
Sarah O’Keefe: Welcome to the Content Strategy Experts podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about content operations with Adam Newton. Adam is the senior director of global content experience services at NetApp. Hi everyone, I’m Sarah O’Keefe. Adam, welcome.
Adam Newton: Hey there, how are you doing, Sarah?
SO: It’s good to see and/or hear you.
AN: Good to hear your voice.
SO: Yeah, Adam and I go way back, which you may discover as we go through this podcast. And as those of you that listen to the podcast know, we talk a lot about content ops. So what I wanted to do was bring somebody in who is doing content ops in the real world, as opposed to as a consultant, and ask you, Adam, about your perspective as the director of a pretty good-sized group that’s doing content and content operations and content strategy and all the rest of it. So tell us a little bit about NetApp and your role there.
AN: Sure. So NetApp is a Fortune 500 company. We have probably close to 11,000 or more global employees. Our business is primarily data infrastructure and storage management, both on-prem and in the cloud. We sell a storage operating system called ONTAP, we sell hardware storage devices, and we are, most importantly I think at this day and age, integrating with Azure, Google Cloud Platform, and AWS in first-party hyperscaler partnerships. I actually have three teams under me at NetApp. The largest of those three teams is the technical publications team. The other two are globalization, responsible for localization and translation of both collateral and product, and then finally, and newest to my team, our digital content science team, which is our data science wing. I have about 50 to 53, I think, employees at this point in my organization, and all told probably about a hundred with our vendor partners.
SO: And so I think we all have a decent idea of what the technical publications team and the globalization teams do. Can you talk a little bit about the data science side? What is that team up to?
AN: Yeah, thank you for asking that question. So about two years ago, I was faced with an opportunity to hire. And maybe some of your listeners who are managers are familiar with that situation, right? I hope they are, rather than not being able to hire. I took a moment and thought a little bit more about what I needed in the future. And I thought a little bit differently about roles and responsibilities, opportunities inside NetApp and the broader content world, and decided to bring in a data scientist. And then I thought a little bit more about, well, there are other data scientists at NetApp. Why would I need one? And I thought a little bit about the typical profile of the data scientists at that time at NetApp, mostly in IT and other product teams. Those data scientists were primarily quantitative data scientists coming from computer science backgrounds. And I thought, well, you know, we’re in the content business. I want to find a data scientist who is a content specialist, who has a background in the humanities, and who also has core data science skills, emphasizing, for example, NLP. And so that was my quest. And I was very, very fortunate to find a PhD candidate in English who wanted to get out of the academy and who had these skills. And it’s been an incredible boon to our organization. We’ve even hired a second PhD in English recently. And Sarah, since you and I are friends, I’ll say one was from UNC and one was from Duke. Okay. So we don’t have to have that discussion here. I’m an equal opportunity person. Although I did hire the UNC one first, Sarah.
SO: I see, I see. So for those of you that don’t live in North Carolina, this is… I’m not sure there is a comparison, but it is important to have both on your team. And I appreciate your inclusion of everybody. It is kind of like… I’ve got nothing.
AN: Yes.
SO: Okay, so you hired some data scientists from a couple of good universities. And do they get along? Do they talk to each other?
AN: Fabulously, yes. No petty grievances.
SO: Okay, just checking. All right. So how do you, in this context then, what does your environment look like? What kinds of things are you doing with the docs team? And what’s the news from NetApp docs?
AN: So maybe a little bit of background, actually, and you and I have talked about this previously, but we used to be a DITA shop. And then as things sped up inside our business with the adoption and development of cloud services at NetApp, we found that some of the apparatus of our DITA infrastructure, our past practices, weren’t able to keep up with the speed of the cloud services that were being developed. I think this is actually, and I’ve talked to other people in our business, a very common situation. We handled it in one way. There are many ways to handle it, but the way we chose was to exit DITA and to move, in our source format anyway, to a format called AsciiDoc, which I frequently describe as a dialect of Markdown. And we went from being a closed system of technical writers working inside a closed CMS to adopting open source. We now work in GitHub. Our pipeline is all open source, and we now have contributors to our content who are not technical writers. In some cases, they’re technical marketing engineers, solution architects, and so forth, as well as a pipeline of docs that we build through automations, where we, for example, transform API specifications or reference docs that are maintained by developers and output those into our own website, docs.netapp.com. In addition to just the docs part, my globalization team has been using machine translation for many years. So speaking to one particular opportunity of being in one organization: when we output our docs, and whenever we update our docs in English, they’re automagically updated in eight other languages and published to docs.netapp.com. So we roughly maintain 150,000 English files, and you can times those by eight. Is that right? Did I do the math right? Yeah.
SO: Or nine, depending.
AN: Nine. Yeah. Is English the language? Yeah, sure. Let’s count it.
SO: Depends on how we use it. Okay, so you have an AsciiDoc, you know, Markdown-ish environment. Is it fair to call it a docs-as-code environment?
AN: So we often describe it as a content ops, environment. I’m not sure if that is, different from Docs as Code, but I think maybe I will accept that as a reasonable description in the sense that, we have asked our team members to think about the content that they’re writing as highly structured, semantically meaningful units of information. I think in the same way I think a developer can be asked to think of their code being that way and the systems in which we write in VS code, many engineers are writing in that.
SO: Mm-hmm.
AN: And of course our source files, as I mentioned, and all our automation and our pipelines, are based on being in GitHub.
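The automation Adam describes, transforming developer-maintained API specifications into published reference docs, might be sketched along these lines. Everything here is invented for illustration (the spec structure, endpoint names, and output conventions); the details of NetApp’s actual pipeline aren’t covered in this conversation.

```python
# Hypothetical sketch: render a minimal AsciiDoc reference page
# from a developer-maintained API spec. The spec shape and names
# below are invented for illustration.

API_SPEC = {
    "title": "Volumes API",
    "endpoints": [
        {"method": "GET", "path": "/volumes", "summary": "List volumes."},
        {"method": "POST", "path": "/volumes", "summary": "Create a volume."},
    ],
}

def spec_to_asciidoc(spec):
    """Emit one AsciiDoc page: a title, then a section per endpoint."""
    lines = [f"= {spec['title']}", ""]
    for ep in spec["endpoints"]:
        lines.append(f"== {ep['method']} {ep['path']}")
        lines.append("")
        lines.append(ep["summary"])
        lines.append("")
    return "\n".join(lines)

print(spec_to_asciidoc(API_SPEC))
```

In a docs-as-code setup, a script like this would typically run in the repository’s CI pipeline, so the reference pages regenerate whenever developers update the spec.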
SO: And so then you’ve got docs.netapp.com as a portal or a platform where a lot of this content goes. And what’s happening over there? Do you have any news on new things you’ve done there?
AN: Yeah. I mean, very recently, and the timing of this is really interesting. We have been working on a generative AI solution for a year, Sarah. You’ll recall the hype, right? When ChatGPT exploded into the public consciousness through the media. Shortly thereafter, we began imagining what it might look like to leverage that technology, those types of technologies, to deliver a different customer experience. And we identified a chatbot as being something we thought could add to the browse and search experiences on docs.netapp.com. And we just released that on the 20th of August and announced it internally inside of NetApp on the 27th. So we are literally like 48, 72 hours into a public adventure here.
SO: I take full credit for planning it, even though I knew nothing about any of this.
AN: Yeah. And that was a long time, I think it’s worth noting. And I think it’s beyond the full dimensions of this discussion to talk about why it took so long. But I will say we were early adopters, and we felt the pain and the benefit of being that. It was like changing the tires on a race car, right? One that was speeding around the track. So we had to learn and be responsive and also humble, in the sense that there were some missteps that we had to recover from, and some magical thinking, I think, at the beginning of the project that was qualified more over the course of the project.
SO: And so what does that GenAI solution sitting in or over the top of the docs content set, what does that do in terms of your authoring process? Do you have any, are there any changes on the backend as you’re creating this content that is then consumed by the AI?
AN: I would say we’re in the process of understanding the full implications of having this new output surface, this generative AI assistant, and fully grappling with what the implications are for the writers. We find ourselves frequently in discussions about audience. And audience is all those humans that we have been writing for and a whole bunch of machines that we now need to think more consciously about, you know, and it’s, we find ourselves often talking about standards and style, but not just from the perspective of, you know, writing the docs in a consistently patterned way for humans to be able to consume well, but also because patterns and machines are a marriage made in heaven. And we see actually opportunities to begin to think of the content we’re writing as a data set that needs to be more highly patterned and predictable so that a machine can consume it and algorithmically and probabilistically decide how to generate content from the content we’re creating.
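Adam’s point about highly patterned, predictable content being easier for a machine to consume can be illustrated with a small sketch: splitting heading-delimited source into self-describing chunks that a retrieval-based assistant could index. The chunking rule (one chunk per `==` heading) and the sample content are assumptions for illustration, not NetApp’s implementation.

```python
# Hypothetical sketch: when docs follow a predictable pattern
# (here, AsciiDoc-style "== " section headings), a machine can
# slice them into labeled chunks for retrieval. Sample text invented.

DOC = """\
== Create a volume
Use the create command.

== Delete a volume
Use the delete command.
"""

def chunk_by_heading(text):
    """Split heading-delimited text into {"heading", "body"} chunks."""
    chunks, heading, body = [], None, []
    for line in text.splitlines():
        if line.startswith("== "):
            if heading is not None:
                chunks.append({"heading": heading, "body": "\n".join(body).strip()})
            heading, body = line[3:], []
        else:
            body.append(line)
    if heading is not None:
        chunks.append({"heading": heading, "body": "\n".join(body).strip()})
    return chunks

for chunk in chunk_by_heading(DOC):
    print(chunk["heading"], "->", chunk["body"])
```

The point of the sketch is the dependency Adam names: this only works because the writers pattern the source consistently. If headings were ad hoc, the machine’s slicing would be too.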
SO: And where is this going in terms of what’s next as you’re looking at this? I think you mentioned that there’s other opportunities potentially to add more data slash content.
AN: Yeah, actually, if I back up to a detail I shared, but maybe quickly: we do have writers in our authoring environment who are not writers. By nature and by bias, they’re subject matter experts, right? And they’re in our system and they’re generating content. So that was about join us in our environment, right? Join us in our environment, reap the benefits of multi-language output, reap the benefits of fast updates, reap the benefits of being able to deliver a web-like experience as opposed to a PDF. But what I think we’ve found now is that this is a data project. This generative AI assistant has changed my thinking about what my team does. And I think, yes, on one level, true: we have a team of writers and there’s a big factory devoted to producing the docs. But in another way, you can look at it and say, well, we’re a data engine. We own and maintain a large data set, and the GenAI is one consumer of that data set. But we’re also thinking about our data set as being joinable to other data sets inside of NetApp. In particular, I work inside the chief design office at NetApp, along with UX researchers and designers, and we’re also more broadly part of our shared platform team at NetApp. So we’re thinking about how we might join our data with other teams’ data to create in-product experiences that are data-led or data-driven in combination with curated experience. So if your viewers were able to see me, I am waving my hand a little bit, not because I’m dissembling, but more because I’m aspiring. And I think there’s a really, really cool future ahead, in a way, Sarah, that I think is super energizing for the writers, right? To see that their work is being reframed, not replaced, right? That’s the fear writers have with GenAI, right, of being replaced.
Well, I would offer this as an example of, you know, maybe it’s not such a dismal view and maybe in fact there’s a very interesting future if you reframe your thinking about what you do and the opportunities to join what you do to create different experiences.
SO: And I think it’s an interesting perspective to look at GenAI as being a consumer of the content slash data that you’re putting out. A lot of the initial stuff was, this is great. GenAI will just replace all the tech writers. You’re talking about something entirely different.
AN: I guess I wanted to expand on that, because I think we’re actually now hovering on a really important point. You know, what is your mindset? How are you thinking about this moment in time? The broader we, right, the broader us, generally, who are in this industry. And, you know, I think we don’t see a great indication that GenAI can create net new content and do it well, honestly. I think it can do summarizing, it can make your day-to-day, your meeting notes and so forth, Microsoft Copilot, right? There are some great uses, but I have not seen convincing, compelling indicators that docs can be written by it, at least at the enterprise level, right? Our products are complex. We often talk about our writers as sense makers, right? And I think that we can take advantage of GenAI in the right ways. And I think this is one of the ways we’re taking advantage of it, which is to give customers another experience. And frankly, also for us to learn a lot about what people are asking and assuming, so we can learn a lot and continuously improve.
SO: So what’s happening on the delivery side? Somebody asks for some sort of information, and it either says it doesn’t exist or it gives an incorrect response. Are you seeing any patterns there? What are you doing with that?
AN: Yeah, many of your listeners might have produced products themselves, right, or delivered products themselves, and remember what happens in the first day or two of releasing a product, right? So the timing of this chat is really good. In the last couple of days, I was just talking to a data scientist on my team, and I was saying, you know, what I think I see emerging as a possible pattern is that people don’t actually know how to use these things effectively. They ask it questions that it really could never answer, or they don’t fully understand the constraints of the system, meaning that it’s only based on a certain data set, and they don’t know that the data set doesn’t include the data they’re looking for, right? Because it sits somewhere else. You know, we’re modifying our processes to intake feedback. I think there’s a really interesting nexus: is it the AI or is it the content? That’s the really interesting one, right? Was the content ambiguous, deficient, duplicitous, whatever? Is that a word?
SO: It is now.
AN: At UNC we use that word, not at Duke. But it is an interesting discussion inside our organization when we receive a piece of feedback, what’s causing it? Is it the interpretive engine or is it our source? And so we’re seeing a lot of gaps in our content, it’s exposing a lot of gaps or other suboptimal implementations.
SO: I mean, we’ve said that in a sort of glib manner, because of course you’re living this day to day and hour by hour, but we’ve said that, you know, GenAI sitting over the top of a content set is going to uncover all your inconsistencies, all your missing pieces, all your, you know, over here you said update and over here you said upgrade. That was an example I heard from someone else. And so it basically uncovers your technical debt.
AN: Yeah, beautiful. Yeah, bingo. You’re so right there. Terminology, right? Oh my God. Can you believe how many ways we’ve talked about X, right?
SO: Right, and the GenAI thinks they’re different because, well, it doesn’t think anything, right? But the pattern isn’t there, and so it doesn’t necessarily associate those things.
AN: Yeah, your listeners may commiserate with this, or the use of words as verbs and nouns, like cable. We often in our documentation talk about cabling devices. How would a GenAI know that the writer of the question is using cable as a verb or noun?
SO: Mm-hmm. So as you’re working through this, and it sounds like two days of go-live plus a year or two or three of suffering.
AN: Well, a year and two days, a year and two days.
SO: You know, I think you’re further along than a lot of other organizations. Do you have any advice for those that are just beginning this journey and just looking at these kinds of issues? What are the things you did best, or maybe worst, or would do the same way or not? What’s out there that you can tell people that’ll maybe help them as they move forward?
AN: Yeah, but maybe think of it in the old people process systems dimensions. Actually, taking that latter one, systems, I would say beware the fascination of the system without thinking more about the processes and people that are going to be involved in the creation of some kind of generative AI solution. I think, you know, this is as much of an adaptive people process as it is a problem as it is a technical problem. Probably more frankly on the adaptive. And from a process perspective, I’d say, be curious about what you learn. Be attentive to the specifics, but look for the broad patterns in the feedback or what you’re seeing as you develop these solutions, you know, for me, I think I hinted at this before and I think it for me has been frankly, the epiphany of the project. There have been many, but I’d say I I would really highlight this one, which is what does my team do? What is the value of what they generate? And for me, yes, we are, you know, primarily a team that creates documentation, but you know, holy smokes, you know, the, the idea that we are data owners, and we govern a massive, semantically rich, non-determinant, fast-changing data set, that is super, super interesting. Even here inside NetApp, Sarah, we have teams reaching out to us who frankly before probably never thought about the docs. And all of a sudden, because we have this huge data set, they’re like, wow, we can, you know stress test our system or our new technologies using what they have. That’s a super cool moment for our team.
SO: Yeah, I think you’re the first person that I’ve heard describe this sort of context shift from this is content to this is data or this content is also data or however you want to phrase that. But I think that’s a really interesting point and opens up a lot of fascinating possibilities, not least for the English PhDs of the world. That’s super helpful.
AN: Is this where I confess that at one time I thought I was going to be one of those, and I got out because I realized I was terrible at it?
SO: No, no, no, that goes in the non-recorded part of the podcast. Yeah, I’m going to wrap it up there before Adam spills all of the dirt.
AN: Yeah, what am I compensating for, right?
SO: But thank you, because this is really, really interesting. And I think it will be helpful to the people listening to this podcast, because it’s so rare to get that inside view of what it really looks like and what’s really going on inside some of these bigger organizations as you move towards AI, GenAI strategies and figure out how best to leverage that. So thank you, Adam. And it’s great to see you.
AN: No, Sarah, thank you. And actually, I would like to thank my team. I mean, it has been an incredible adventure, and I think the team is really amazing.
SO: Yeah, and I know a few of them and they are great. So with that, thank you for listening to the Content Strategy Experts Podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Enterprise content operations in action at NetApp (podcast) appeared first on Scriptorium.
In episode 174 of The Content Strategy Experts podcast, Sarah O’Keefe and Alan Pringle explore the mindset shifts that are needed to elevate your organization’s content operations to the enterprise level.
If you’re in a desktop tool and everything’s working and you’re happy and you’re delivering what you’re supposed to deliver and basically it ain’t broken, then don’t fix it. You are done. What we’re talking about here is, okay, for those of you that are not in a good place, you need to level up. You need to move into structured content. You need to have a content ops organization that’s going to support that. What’s your next step to deliver at the enterprise level?
— Sarah O’Keefe
Related links:
LinkedIn:
Transcript:
Alan Pringle: Welcome to the Content Strategy Experts Podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about setting up your content operations for success. Hey everyone, I am Alan Pringle, and I am back here with Sarah O’Keefe in yet another podcast episode today. Hello, Sarah.
Sarah O’Keefe: Hey there.
AP: Sarah and I have been chatting about this issue. It’s kind of been this nebulous thing floating around, and we’re gonna try to nail it down a little bit more in this conversation today: this idea of setting up your organization for success in their content operations. And to start the conversation, let’s just put it out there. Let’s define content ops. What are content operations, Sarah?
SO: Content strategy is the plan. What are we going to do, how do we want to approach it? Content ops is the system that puts all of that in place. And the reason that content ops these days is a big topic of conversation is because content ops in sort of a desktop world is, well, we’re going to buy this tool, and then we’re going to build some templates, and then we’re going to use them consistently. And the end, right? That’s pretty straightforward. But content operations in a modern content production environment means that we’re talking about a lot of different kinds of automation and integration. So the tools are getting bigger, they’re scarier, they’re more enterprise level as opposed to a little desktop thing. And configuring a component content management system, connecting it to your web CMS and feeding the content that you’re generating in your CCMS, your component content management system, into other systems via some sort of an API is a whole different kettle of fish than dealing with, you know, your basic old school unstructured authoring tool. So yeah.
AP: Right. But in their defense, for the people who are using desktop publishing, that is still content operations.
SO: Sure, it is.
AP: It’s just a different flavor of content operations. And frankly, a lot of people, a lot of companies and organizations outgrow it, which is why they’re going to this next level that you’re talking about.
SO: Right. So if you’re in a desktop tool and everything’s working and you’re happy and you’re delivering what you’re supposed to deliver and basically it ain’t broken, then don’t fix it. You are done. You should shut off this podcast and go do something more fun with your time. Right? What we’re talking about here is, okay, for those of you that are not in a good place, you need to level up. You need to move into structured content. You need to have a content ops organization that’s going to support that. What do you do? What’s your, you know, what’s your next step and what does it look like to organize this project in such a way that you move into, you know, that next level up and you can deliver all the things that you’re required to deliver in the bigger enterprise, whatever you want to call that level of things. So desktop people, I’m slightly jealous of you because it’s all working and you’re in great shape and good for you. I’m happy for you.
AP: So making this shift from content operations and desktop publishing to something more enterprise level like you’re talking about, that is a huge mind shift. It is also technically something that can be quite the shock to the system. How do you go about making that leap?
SO: Well, I’m reminded of a safety announcement I heard on a plane one time where they were talking about how, you know, when you open the overhead bins after landing, you want to be careful. And the flight attendant said, shift happens. And we all just looked at her like, did you actually just say that? And she sort of smirked. So making this shift can be difficult, right? And what we’re usually looking at is, okay, you’ve been using, you know, Word for the past 10, 15, 20, 57 years. And now we need to move out of that into, you know, something structured XML, maybe it’s DITA, and then get that all up and running. And so what’s going to happen is that you have to think pretty carefully about what does it look like to build the system and what does it look like to sustain it? Now here I’m talking particularly to large companies because what we find is the outcome in the end, right, when this is all said and done and everything’s up and running and working, what you’re probably going to have is some sort of an organization that’s responsible for sustainment of your content ops. So you’re going to have a content ops group of some sort, and they’re going to do things like run the CCMS and build new publishing pipelines and keep the integrations moving and help train the authors. And in some cases, they’re kind of a services organization in the sense that you have an extended group of maybe hundreds of authors who are never going to move into structured content. So you’re taking on the, again, Word content that they are producing, but you’re moving it into the structured content system as a service, like an ingestion or migration service to your larger staff or employee population. Okay, so in the future world, you have this group that knows all the things and knows how to keep everything running and knows how to kind of manage that and maintain it and do that work. 
And probably in there, you have an information architect who’s thinking about how to organize content, how to classify and label things, how to make sure the semantics, you know, the actual element tags are good and all that stuff. But right now, you’re sitting in desktop authoring land with a bunch of people that are really good at using whatever your desktop authoring tool may be. And you have to sort of cross that chasm over to, now we’re this content ops organization with structured content, probably a component content management system. So what I would probably look at here is, you know, what is the outcome? You know, thinking about the system has stood up, we’ve made our tool selection, everything’s working, everything’s configured, everything’s great. What does it look like to have an organization that’s responsible for sustaining that? And that could be, you know, two or three or 10 people, depending on the size, again, the size and scope of your organization and the content that you’re supporting. But in order to get there, you first have to get it all set up. You have to do the work to get it all up and running. Our job typically is that we get brought in to make that transition. Right? So we’re not going to be for a large organization, we’re not going to be your permanent content ops organization. We might provide some support on the side, but you’re going to have people in-house that are going to do that. They’re going to be presumably full-time permanent kind of staff members. They know your content and your domain and they have expertise in, you know, whatever your industry may be.
AP: Right.
SO: Our job is to get you there as fast as possible. So we get brought in to do that setting up piece, right? What are the best systems? What are the things you need to be evaluating? What are the weird requirements that you have that other organizations don’t have that are going to affect your decisions around systems and for that matter, people, right? Are you regulated? What is the risk level of this content? How many languages are you translating into? What kind of deliverables do you have? What kind of integration requirements do you have? And when I say integration, to be more specific, maybe you’re an industrial company and so you have tasks, service, maintenance kinds of things, and you need those tasks like how to replace a battery or how to swap out brakes to be in your service management system so that a field service tech can look at their assignments for the day, which are, you know, go here and do this repair and go here and do this maintenance. And then it gets connected to, and here’s the task you need and here’s the list of tools you need. And here are all the pieces and parts you need in order to do that job correctly. Diagnostic troubleshooting systems. You might have a chat bot and you want to feed all your content into the chat bot so that it can interact with customers. You may have a tech support organization that needs all this content and they want it in their system and not in whatever system you’re delivering. So we get into all these questions around where does this content go? You know, where does it have tentacles into your organization and what other things do we need to connect it to and how are we going to do that? So I think it’s very helpful to look at the upfront effort of configuring, you know, making decisions, designing your system and setting up your system versus sustaining, enabling, and supporting the system.
AP: There are lots of layers that you just talked about and lots of steps. It is very unusual, at least in my experience, to find someone, some kind of personnel resource, either within or hiring, who is going to have all of the things that you just mentioned because it is a lot to expect one person to have all of that knowledge, especially if you are moving to a new system, and you’ve got a situation where the current people are well versed in what is happening right now in that infrastructure, that ecosystem. To expect them to magically shift their brain and figure out new things, that’s a lot to ask for. And I think that’s where having this third-party consultant person, voice, is very helpful because we can help you narrow in on the things that are better fits for what you’ve got going on now and what you anticipate coming in the future.
SO: Yeah, I mean, the thing is that what you want from your internal organization is the sustainability. But in order to get there, you have to actually build the system, right? And nearly always when people reach out to us and say, we’re making this transition, we’re interested, we’re thinking about it, et cetera, they’re doing it because they have a serious problem of some sort. We are going into Europe and we have no localization capabilities or we have them, but we’ve been doing, you know, a little bit of French for Canada and a tiny bit of Spanish for Mexico. And now we’re being told about all these languages that we have to support for the European Union. And we can’t possibly scale our, you know, 2.5 languages up to 28. It just, it just can’t be done. We’ll, we’ll drown. Or people say, We have all these new requirements and we can’t get there. We’ve been told to take our content that’s locked into, you know, page based PDF, whatever, and we’re being required to deliver it, not just onto the website and not just into HTML, as you know, content as a service, as an API deliverable, as micro content, all this stuff. And they just, they just can’t, you can’t get there from here. And so you have people on the inside who understand, as you said, the current system really well, and understand the needs of the organization in the sense of these things that they’re being asked to do and they understand the domain. They understand their particular product set internally. But it’s just completely unreasonable to ask them to stand up, support and sustain a new system with new technology while still delivering the existing content because, you know, that doesn’t go away. You can’t just push the pause button for five months.
AP: No, the real world does not stop when you are going on some kind of huge digital transformation project like one of these content ops projects. So basically what we’re talking about here, especially on the front end, the planning discovery side, is we can help augment, help you focus. And then once you kind of picked your tools and you start setting things up, there’s some choices there that sometimes have to do with like the size of an organization about how to proceed with implementation and then maintenance beyond that. Let’s focus on that a little bit.
SO: Most of the organizations we deal with are quite large. Actually, all of the organizations we deal with are quite large compared to us, right? It’s just a matter of are they a lot bigger or are they a lot, a lot, a lot, lot bigger?
AP: Correct.
SO: Within that, the question becomes how much help do you want from us and how much help do your people need in order to level up and get to the point where they can be self-sufficient? We have a lot of projects we do where we come in and we help with that sort of big hump of work, that big implementation push, and help get it done. And then once you go into sustainment or maintenance mode, it’s 10% of the effort or something like that. And so either you staff that internally as you’re building out your organization internally, or we stick around in sort of a fractional, smaller role to help with that. The pendulum kind of shifted on this for a while, or way back, way back when it was get in, do the work and get out. We rarely had ongoing maintenance support. Then for a bit, we were doing a lot of maintenance relative to the prior efforts. And now it feels as though we’re seeing a shift in a little bit of a shift back to doing this internally. Organizations that are big enough to have staff like a content ops group or a content ops person are bringing it back in-house instead of offloading it onto somebody like us. We’re happy to do whatever makes the most sense for the organization. At a certain size, my advice is always to bring this in-house because ultimately, your long-term staff member who has domain expertise on your products and your world and your corporate culture and has social capital within your organization will be more effective than offloading it onto an external organization, no matter how great we are.
AP: To wrap up, I think I want to touch on one last thing here, and that’s change management. And yes, we beat that drum all the time in these conversations on this podcast, but I don’t think we can overstate how important it is to keep those communication channels open and be sure everyone understands what’s going on and why you’re doing what you’re doing. What we’ve talked about so far is very much, okay, we’ve come up with a technical plan, we’ve done a technical implementation, and now we’re going to set it up for success and maintain it for the long haul and adjust it as we need to as things change. But there are still a group of people who have to use those tools, your content creators, your reviewers, all of those people, your subject matter experts, I mean, I can go on and on here, they are still part of this equation and we can’t forget about them while we’re so focused on the technical aspects of things.
SO: I would say this directly to the people that are doing the work, you know, the authors, the subject matter experts, the people operating within the system. I would look at this as an opportunity. It is an opportunity for you to pick up a whole bunch of new skills, new tools, new technologies, new ways of working. And while I know it’s going to be uncomfortable and difficult and occasionally very annoying as you discover that the new tools do some things really well, but the things that were easy in the old tools are now difficult, right? There’s just going to be that thing where the expertise you had in old tool A is no longer relevant and you have to sort of learn everything all over again, which is super, super annoying. But it’s fodder for your resume, right? I mean, if it comes to it, you’re going to have better skills and you’re going to have another set of tools and you’re going to be able to say, yes, I do know how to do that. So I think that just from a self-preservation point of view, it makes a whole lot of sense to get involved in some of these projects and move them forward because it’s going to help you in the long run, whether you stay at that organization or whether you move on to somewhere else, you know, at some point in the future. That’s one of the ways I would look at this. It is certainly true that the change falls on the authors, right?
AP: Correct.
SO: They all have to change how they work and learn new ways of working, and there’s a lot there, and I don’t want to, you know, sort of sweep that aside because it can be very painful. We try to advocate for making sure that authors have time to learn the new thing, that people acknowledge that they’re not going to be as productive day one in the new system as they were in the old system that they know inside out and upside down, that they get training and knowledge transfer and just, you know, a little bit of space to take on this new thing and understand it and get to a point where they use it well. So I think there’s a, you know, there’s a combination of things there. For those of you that are leading these projects, it is not reasonable, again, to stand the thing up and say, go live is Monday. So, you know, I expect deliverables on Tuesday. That is not okay.
AP: Yeah. And you’ve just wasted a ton of money and effort because you’ve thrown a tool at people who don’t know how to use it. So all of your beautiful setup kind of goes to waste. So there are a lot of options here as far as making sure that your content ops do succeed. And I think, like pretty much everything else in consulting land, it is not one size fits all.
SO: It depends, as always. We should just generate one podcast and put different titles on it and just say it depends over and over again.
AP: Pretty much, we’d probably just get an MP3 of us saying that phrase over and over again and just loop it and that will be a podcast episode. And on that not-great suggestion for our next episode, I’m gonna wrap this up. So thank you, Sarah.
SO: Thank you.
AP: I think she just choked on her tea, everyone.
SO: I did.
AP: Thank you for listening to the Content Strategy Experts Podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Position enterprise content operations for success (podcast) appeared first on Scriptorium.
Translation troubles? This podcast is for you! In episode 173 of The Content Strategy Experts podcast, Bill Swallow and special guest Mike McDermott, Director of Language Services at MadTranslations, share strategies for overcoming common content localization challenges and unlocking new market opportunities.
Mike McDermott: It gets very cumbersome to continually do these manual steps to get to a translation update. Once the authoring is done, ideally you just send it right through translation and the process starts.
Bill Swallow: So from an agile point of view, I am assuming that you’re talking about not necessarily translating an entire publication from page one to page 300, but you’re saying as soon as a particular chunk of content is done and “blessed,” let’s say, by reviewers in the native language, then it can immediately go off to translation even if other portions are still in progress.
Mike McDermott: Exactly. That’s what working in this semantic content and these types of environments will do for a content creator. You don’t need to wait for the final piece of content to be finalized to get things into translation.
Transcript:
Bill Swallow: Welcome to the Content Strategy Experts podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we explore strategies for conquering localization challenges, and unlocking new market opportunities. Hi everybody. I’m Bill Swallow, and with me today is Mike McDermott from MadCap Software. Hey Mike.
Mike McDermott: Hi Bill.
BS: So before we jump in, Mike, would you like to provide a little background information about you, who you are, what you do at MadCap?
MM: Sure. My name is Mike McDermott. I am the director of language services at MadCap Software, working with our MadTranslations group. And we support companies that work in single-source authoring and multichannel publishing tools like those offered by MadCap Software, such as IXIA, MadCap Flare, Xyleme, and other tools.
BS: So Mike, what are some of the challenges you’ve seen and what works for overcoming some of these localization challenges?
MM: One of the main challenges I see with companies that come to us, and they typically come to us because they’re looking at working in an XML-based authoring tool and they’re curious about the advantages it has for translation. And one of the biggest challenges I see initially with these companies is just figuring out what content needs to go into translation when you’re working in different types of tools. And one of the ways I see to solve that problem is working in a tool where you have the ability to tag certain content and identify content for different audiences or different purposes. It just makes it simpler to identify that content and get it straight into translation and removes a lot of the human error around packaging up content and trying to figure out yourself what files house text that might be translatable for whatever the output is that you’re looking to build. So just working in those tools I see inherently helps with translation because it helps you identify exactly what needs to be translated and it gets it into translation much quicker.
BS: So I think we’re talking about semantic content there and making sure that you have all the right metadata in place so that you can identify the correct audience, the correct, let’s say, versions of the product, whether to translate or not, and any other relevant information about the content. So you’re able to isolate the very specific bits of content that need to be translated and omit a lot of the content that isn’t necessarily needed for that deliverable.
MM: Exactly, Bill. It lets the technology tell you what needs to be translated and what houses text versus you trying to go through a file list and determine what do I need to send out to a translator to translate. The flip side of that is to just send everything for translation, but it’s very rare that everything in any given project for any type of system is going to need to be translated. So by tagging it in that way, you can quickly get into translation and get things moving. And what I see happening at the end of these projects, oftentimes when you’re not working in those types of systems, is you end up finding bits and pieces of content or different files that ended up needing to be translated that missed that initial pass. Now they have to go back through translation and you’re delayed. So just getting everything right the first time and relying on the tools to tell you exactly what needs to be translated by looking up metadata or different tags just simplifies the process and speeds everything up, helps translation get done quicker and just improves time to market for the end user to get their content out.
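To make the tagging idea concrete, here is a minimal sketch in Python of metadata-driven selection, as opposed to picking files by hand. The file names, attribute names, and XML shape are made up for illustration and are not tied to MadCap or any specific tool; DITA, for reference, does have `translate` and `audience` attributes that work along these lines.

```python
# Minimal sketch: let metadata, not a human, decide what goes to translation.
# Topics carry a translate flag and an audience tag; only in-scope,
# translatable topics are collected for the translator.
import xml.etree.ElementTree as ET

# Hypothetical topic files (content inlined here to keep the sketch self-contained).
topics = {
    "install.xml": '<topic translate="yes" audience="customer"><title>Install</title></topic>',
    "internal.xml": '<topic translate="no" audience="internal"><title>Internal notes</title></topic>',
    "repair.xml": '<topic translate="yes" audience="technician"><title>Replace battery</title></topic>',
}

def translatable(files, audiences):
    """Return the names of files whose metadata marks them for translation."""
    selected = []
    for name, xml_text in files.items():
        root = ET.fromstring(xml_text)
        if root.get("translate") == "yes" and root.get("audience") in audiences:
            selected.append(name)
    return selected

print(translatable(topics, {"customer", "technician"}))  # → ['install.xml', 'repair.xml']
```

The internal-only topic never reaches the translation package, and nothing translatable is missed on the first pass, which is the human-error point Mike describes.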
BS: So it sounds like it reduces a good amount of friction, especially with regard to finding missing bits and pieces that should have been translated and weren’t, and then needing to go back and make sure that’s done in time. What are some other ways that people can reduce friction in their translation workflow?
MM: Well, a big emphasis for us over the past few years around removing friction is working with connectors and different technologies that can orchestrate the translation process. So we can automate a lot of this and remove the bottlenecks around someone having to, like I said before, manually go into a set of files and package things up for a translator, zip up files, upload them to different locations, and they just get passed around and things can happen when working that way, even outside of just missing files. So working with connectors and these technologies that can connect directly into these systems and get the text right into translation, removing all those friction points, just eliminates a lot of room for error and project delays, bottlenecks for tasks that can be easily handled by modern technology.
BS: And I assume that there’s probably some technology there as well that kind of governs other things, other parts of the workflow, like review, content validation, that type of thing?
MM: Exactly, exactly. So we’re trying to automate the flow of data into the different points in translation and then get the content ready. For example, for reviewers, you mentioned reviewers. So once content gets into translation, we can get it right into the translation system from the authoring environment that the customer’s working in, get it into translation. And as soon as the translation is done, a human reviewer on the client side or on our side or whoever can be notified that this content is ready for review and it just helps keep things moving. So now it’s on them to complete their review. And once that’s done, the process can continue on and the automated QA checks, the human QA checks can be done at that point, and then the project can be pushed back to wherever it needs to go and put into publication. But by automating the steps and plugging in the humans where they provide the most value, it just removes the time costs and error-prone steps that don’t need to be there.
BS: So it sounds like a lot of it does come down to saving a good deal of time. I would also imagine that these types of workflows, they also help streamline a lot of the publishing needs that come after the translation as well.
MM: Correct. And that’s kind of why we started MadTranslations when we did, was to provide our customers a place to go to work with a translation agency that understood these tools and understood how these bits and pieces come together to build an output. We put it together to provide our customers a turnkey solution where they can get a working project back and quickly get into publication. By removing the friction points and using modern technology to automate a lot of these processes, we’re able to get things into translation and add a translation into the final deliverable much faster. So once that happens, we can build the outputs and, if it requires a human check, things can get to that point much quicker, and we’re not waiting for somebody to manually pull down files and put them into another location so the next step can actually take place. We want to automate that part of it so we can get to that final output and into a project file where a customer can plug it into their publishing environment and get it out as quickly as possible. A lot of the wasted time is around those manual steps, and when it comes to validation and review, it’s just the reviewers and validators maybe not being ready for the validation or not being educated on how it will work. So it’s important to make sure that everyone in that process knows how it’s going to be done, when things are going to be ready for the review or the QA checks. And then the idea from there is to just feed the content in via connectors, removing the friction point, and just send it through. And this is necessary, especially when you’re doing very frequent updates and kind of a more of an agile translation workflow. It gets very cumbersome to continually do these manual steps to get to a translation update. Once the authoring is done, ideally you just send it right through translation and the process starts.
BS: So from an agile point of view, I am assuming then that you’re talking about not necessarily translating an entire publication from page one to page 300, but you’re talking about as soon as a particular chunk of content is done and it’s “blessed,” let’s say, by reviewers in the native language, then it can immediately go off to translation even if other portions are still in progress.
MM: Exactly. Exactly. And that’s what working in this semantic content and these types of environments will do for a content creator: you don’t need to wait for the final piece of content to be finalized to get things into translation. So as you said, it becomes even more important when you’re doing updates because you don’t want to have to send over the entire file set every time you’re doing an update. Whereas when you’re working in a more linear format like Word, you end up having to send that full file every time, and the translation agency is likely reprocessing it using translation memory. But all that stuff still takes time, and working in these types of tools, you can very quickly identify those new parts or those bits that you know are ready for translation, tag them or mark them in some way, and send them through the translation process.
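The incremental hand-off described above can be sketched very simply: fingerprint each content chunk and ship only the chunks that changed since the last translation run, instead of resending the whole publication. This is an illustrative sketch, not how any particular translation connector works; the chunk names and content are invented.

```python
# Minimal sketch of chunk-level incremental translation hand-off:
# hash each chunk, compare against the hashes from the last run,
# and send only new or changed chunks to the translator.
import hashlib

def sha256_hex(text):
    """Fingerprint one chunk of content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def changed_chunks(current, last_sent_hashes):
    """Return the names of chunks that are new or differ from the last translated version."""
    return [
        name for name, text in current.items()
        if last_sent_hashes.get(name) != sha256_hex(text)
    ]

# Hypothetical publication chunks: "intro" is unchanged, "battery" is new.
chunks = {
    "intro": "Welcome to the product.",
    "battery": "Replace the battery as follows...",
}
previous = {"intro": sha256_hex("Welcome to the product.")}

print(changed_chunks(chunks, previous))  # → ['battery']
```

Only the new chunk crosses the fence to the translator; translation memory on the agency side then handles fuzzy matches within the chunks that do get sent.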
BS: Very cool. So a lot of the work that we’re seeing now on the Scriptorium side of things is in re-platforming. So people have content in an old system, or they have, say, a directory full of decaying Word files, and they want to bring it into some other new system. They want to modernize, they want to centralize everything, basically have a situation where they’re working in DITA or some other structured content, bring it into semantic content. What are some of, I guess, the benefits that doing that gives you as far as translation goes when you’re looking at content portability? So being able to jump ship from one system to another.
MM: I think working in those systems where the text or the content is stored away from the output that you’re building has a lot of benefits, not only to translation, being able to just get the text that needs to be translated exported out of the system and then put back where it needs to go. But it really future-proofs you and gives you the portability that you talk about to make changes, because the text is stored in a standard format that can be ported. Versus you see some organizations getting locked into a closed environment where, when they go to make a change, it requires certain types of exports to other file types that other tools can then import. But by storing content in a standard way, in XML, for example, it gives you that flexibility and future-proofs you from being locked into any one scenario.
BS: Excellent. So I have to ask, since I’ve come from a localization background as well, what’s one of the hairier projects that you’ve seen or one of the hairier problems that people can run into and in a localization workflow?
MM: One of the challenges we run into sometimes is around client review, when you start incorporating validators into the translation system and include them as part of the process, and you get multiple reviewers. Sometimes a company will assign a reviewer for every language, but you might have different people reviewing the same set of content. I mean, the biggest delay that we see with projects is the translation is delivered and then dumped on the desk of a native speaker within the company, and they’re asked to review it and they’re not ready to do the review, it’s not scheduled, and it can delay the project. That’s one of the biggest delays we see. So that’s why we try at the front end of a project to figure out on the client side, what’s going to happen after we deliver this project, after we send the files? Is the content going to be reviewed or validated? If so, let’s figure out a way to incorporate them into our translation system where they can review the translations before we build the outputs and do all the QA checks. So that’s one of the hairier situations in terms of time delays. Expectations around just time in general have always been a thing in localization. As you know, people can be surprised as to how long it can take for a translator to get through content. I mean, the technology is certainly there to speed it up. Since we started MadTranslations a little over 10 years ago, we’ve seen translation speed increase quite a bit, but it still takes time for a good translator to get through that content and know when to stop and do the research that’s needed to get a technical term right. So that’s one of the surprise moments I think for new buyers of localization, the time that it can take, and there are solutions in place, like I said, to make it go faster. 
But if you want that human review and that expertise and the cognitive ability to know when to stop and figure out what this term is or what the client wants or doesn’t want around certain terminology, and then to database it and include that as part of the translation asset so it stays consistent every time, that takes time versus just sending something through a machine translation, doing a quick spot check, and sending it back to the customer.
BS: So it sounds like having that workflow defined and setting those expectations that certain things need to happen at each point of that workflow. Some of it might be automated, some of it does require a person, and that person I guess should probably be identified ahead of time and given a heads-up that, “Hey, something’s going to be coming at you in three weeks. Be ready for it.”
MM: Be ready for it. And also, what are you ready for? So it’s kind of training a reviewer, what are you looking for here? Are we looking for key terms? Are we looking for style preferences? Everyone kind of understanding what it is that a reviewer is going to be looking for, and they might be looking for different things when it comes to technical documentation versus a website, for example. So just having everyone communicate and understand what the intended purpose of the final output is and where everyone fits in the process and defining a schedule around that process definitely helps.
BS: Definitely. I've seen cases myself, working for a translation agency, where a client came to me and basically said, "I need this done as soon as possible. What can you do?" It was a highly technical manual, and we said, "Well, we have an expert in these different languages. This person is available now. This one won't be available until next month. And this person really only works nights and weekends because they're a professional engineer in their day job. So turnaround is going to be a little slow." And the client persisted: we just need it as soon as possible, we need to get it out the door in a couple of weeks. And I'm thinking to myself in the back of my head, why are you coming to us now when you need this in a couple of weeks? You shouldn't be throwing it over the fence at the last possible minute and expecting it to come back tomorrow. So there was that education piece. Unfortunately, they decided that they didn't care. They wanted us to use as many translators as possible and get it done as quickly as possible. And we had them sign documents that basically said we were not liable for the quality of the translation, since the client was looking to get this done as quickly, cheaply, and dirtily as possible. It was a nightmare, and I think it took one round of review on the client side for them to circle back and say, "Okay, I get what you were saying now." None of the translations worked together at all, because we were literally sending out each chapter to a different translator, and there was no style guide because the client hadn't provided anything, and no terminology set because the client didn't provide anything, and everything came back different. And they said, "Okay, we get it. We'll revise our schedules. Get it done the right way. I don't care how long it takes."
MM: I've run into something very similar to what you described, and the answer was to put disclaimers in the documents: this is going to be poor quality, we're admitting it right now, this is the only way we're going to get it back within a week, and we do not recommend publishing. And as soon as the files came back, someone looked at them and said, "Okay, let's back up and do it the right way."
BS: Yes. I guess the biggest takeaway there is to plan ahead and plan for quality, not just try to get it done as fast as possible.
MM: And that's one of the benefits of where we sit at MadTranslations within MadCap Software: companies coming into these types of environments are typically at the front end, in the planning stages, trying to figure out how all of this is going to work. So we have an ability to help them understand what the process looks like, then define it in combination with our tooling and their needs, and come up with a workflow that's going to keep things moving fast but gives you that human-level quality that everyone needs at the end.
BS: Being able to size up exactly what the process needs to look like before you're in the thick of it definitely helps. And having that opportunity to coach someone through setting up the process for the first time is priceless, because so many mistakes can happen out of the gate, from how people are authoring content to what their workflow looks like.
MM: And it's even more important for companies that have to maintain the content. It's one thing to just take a PDF and say, "Hey, I need to translate this file and I'm never going to have to update it again. I just need a quick translation." It's another to have a team of authors dispersed around the globe working on the same set of content that then needs to be translated continuously.
So there are different needs, but like you said, planning, defining the steps, and knowing the requirements of the content, from authoring to time-to-publication in each language, and figuring out how to fit the steps together to meet those requirements as best as possible, is best done upfront, versus when it needs to be published in a week.
BS: Planning, planning, planning. I think that sounds like a good place to leave it. Mike, thank you very much.
MM: Thank you, Bill. Thanks for having me on.
BS: Thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Conquering content localization: strategies for success (podcast) appeared first on Scriptorium.
When organizations replatform from one content management system to another, unchecked technical debt can weigh down the new system. In contrast, strategic replatforming can be a tool for reducing technical debt. In episode 172 of The Content Strategy Experts podcast, Sarah O’Keefe and Bill Swallow share how to set your replatforming project up for success.
Here's the real question I think you have to ask before replatforming—is the platform actually the problem? Is it legitimately broken? As Bill said, has it evolved away from the business requirements to a point where it no longer meets your needs? Or there are some other questions to ask, such as, what are your processes around that platform? Do you have weird, annoying, and inefficient processes?
— Sarah O’Keefe
Transcript:
Sarah O'Keefe: Welcome to the Content Strategy Experts Podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about replatforming and its relationship to technical debt. Hi, everyone. I'm Sarah O'Keefe. And the two of us rarely do podcasts together for reasons that will become apparent as we get into this.
Bill Swallow: And I’m Bill Swallow.
SO: What we wanted to talk about today was some more discussion of technical debt, but this time with a focus on a question of whether you can use replatforming and new software systems to get rid of technical debt. I think we start there with the understanding that no platform is actually perfect.
BS: Mm-hmm.
SO: Sorry, vendors. It’s about finding the best fit for your organization’s requirements and then those requirements change over time. Now Bill, a lot of times when we talk about replatforming, you hear people referring to the burning platform problem. So what’s that?
BS: Yeah, well, it may actually be on fire, but likely not. What we're really talking about is a platform that was chosen many years ago. Perhaps it's approaching end-of-life. Perhaps your business needs have taken a sharp left or right turn and the platform no longer supports those business needs. Or it really could just be a matter of cost. The platform you bought 10 years ago was built upon a very specific cost structure and model, the world is different now, there are different pricing schemes and whatnot, and you may just want to replatform to recoup some of that cost.
SO: So does that work? I mean, if you exit platform A and move on to platform B, are you necessarily going to save money? So... no?
BS: In a perfect world, yes, but we don't live in a perfect world. I hate to be the bearer of bad news, but if you're looking to switch from one platform to another to save costs, there is a cost in making that switch. And at that point, you need to weigh the benefits and drawbacks: is the cost to move to a new system going to be worth the cheaper solution in the long run? That's a very basic model to look at, and there are a lot of other costs, benefits, and drawbacks to making a platform switch. But it's one thing to consider there.
SO: Yeah. Additionally, it's really common to have people come to us and say, our platform is burning, we're unhappy with platform X, and we want to replatform onto platform Y. What's funny is that usually we have some other customer saying, I'm unhappy with platform Y and I need to go to platform X, right? So it's a conveyor belt of sorts.
BS: You can’t please everybody.
SO: But the real question I think you have to ask before replatforming is, is the platform actually the problem here? Is it legitimately broken? And as you said, has it evolved away from the business requirements to a point where it no longer meets your needs? Or there are some other questions to ask, like, what do your processes around that platform look like? Do you have weird, annoying, and inefficient processes?
BS: Mm-hmm.
SO: Do you have constraints that are going to force you in a direction that maybe isn't the best one from a technology point of view? Have you made some old decisions that are now non-negotiable? So you'll see people saying, well, we have this particular kind of construct in our content and we're not giving it up ever.
BS: Mm-hmm.
SO: And you look at it and you think, well, it’s very unusual and is it really adding value, but it’s hard to get rid of it because it’s so established within that particular organization. So the worst scenario here is to move from A to B and repeat all the same mistakes that were made in the previous platform.
BS: Yeah, you don't want to carry that debt over, certainly. So anything that you have established that worked well but doesn't meet your current or future needs, you absolutely do not want to move forward. That being said, you have a wealth of content and technology that you have built over the years, and you want to make sure you can use as much of that as possible to at least give yourself a leg up in the new system, so that you don't have to rewrite everything from scratch or completely rebuild your publishing pipelines. You might be able to move them over and change them, and you might be able to move and refactor your content so that it better meets your needs. But I guess it's a long way of saying that not only are you looking at a burning platform problem, you're also looking at a futureproofing opportunity. And you want to make sure that if you are going to do that lift and shift to another platform, you take a few steps back, look at what your current and future requirements are or will be, and make the necessary changes during the replatforming effort, before you get into the new system and then start having to essentially deal with the same problems all over again.
SO: Yeah, to give a slightly more concrete example of what we're talking about: relative to 10 years ago, PDF output is relatively less important. 10 years ago, we were getting a lot of, we need PDF, we have to output it, and it has to meet these very, very high standards. People are still doing PDF, and clients are still doing PDF, but relatively, it is less of a showstopper, primary requirement. It's more, yes, we still have to do PDF, but we're willing to negotiate on what that PDF is going to look like. Instead of saying it has to be this pristine and very complex output, they're willing to drop that down a few notches. Conversely, the importance of HTML website alignment has gotten much, much higher. And we have a lot of requirements around Content as a Service and API connectors and those kinds of things. So if you look at all your different publishing output connection pipelines, 10 years ago PDF was still unquestionably the most important thing, and that's not necessarily the case anymore.
BS: And on the HTML side, it could be HTML, it could be JSON, but you do have a wealth of apps, whether it's a phone app or an app in your car or an app on your fridge, that need to be supported as well, where your PDF certainly isn't going to cut it. And a PDF approach to content design in general is not going to fly.
SO: So when we talk about replatforming, in many cases I look at this through the lens of, okay, we have DITA content in a CCMS and we're going to move it to another DITA CCMS. But in fact, it goes way, way beyond that, right? What are some of the input, or I'll say legacy, formats that we're seeing on the inbound side of a replatforming?
BS: Let's see, on the inbound side, we certainly have older models of DITA. So maybe something that was developed in DITA 1.1 or 1.2, or pre-1.0, something that's heavily specialized. We have unstructured content, like Word files, InDesign, unstructured FrameMaker, and what have you. We're also seeing an opportunity to move a lot of developer content into something that is more centrally managed; in that case, we've got Markdown and other lightweight formats that need to be considered and migrated appropriately. And then, of course, all of your structured content. So we mentioned DITA. There's DocBook out there. There are other XML formats and whatnot. And potentially you have other things that you've been maintaining over the years, and now is a good opportunity to migrate them over into a system, centralize them, and get them aligned with all your other content.
SO: Yeah, and looking at this, I think it's safe to say that we see people both entering and exiting Markdown: people saying we're going to go from DITA to Markdown, but also Markdown to DITA. We're seeing a lot of moves into structured content in various flavors. Unstructured content we largely see as something people are exiting, right? We don't see a lot of people saying, "Put us in Word, please."
BS: No, no one’s going from something like DITA into Word.
SO: So they might go from DITA to Markdown, which is an interesting one. Okay, so I guess then that’s the entry format. That’s where you’re starting. What’s the outcome format? Where are people going for the most part?
BS: For the most part, there are essentially two winners. There are the XML-based formats, and then there are the Markdown-based formats. And I'm lumping DITA, DocBook, and other proprietary XML models all into XML. But generally, people are migrating more toward that direction than to Markdown. And there's really a division there: it's whether you want the semantics ingrained in an XML format and the ability to apply, or heavily apply, metadata. Or whether you want something lightweight that's easy to author and is relatively, I don't want to say single purpose, but not as easily multi-channel as you can get with XML.
SO: Yeah, the big advantage to Markdown is that it aligns you with developer workflows, right? You get into Git, you're aligned with all the source control and everything else that's being done for the actual software code. And if that is a need that you have, then that's the direction to go in. There are, as Bill said, some really big scalability issues with that, and that can be a problem down the line. Okay, so we pick a fundamental content model of some sort, and then we have to think about software. So what does that look like? What are the buckets that we're looking at there?
BS: For software, we've got a lot of things. First and foremost, there's the platform that you're moving to. What does that look like? What does it support? You certainly have authoring tools. You also have all of your publishing pipelines. All of that's going to require software to some degree; some of it's third party, some of it's embedded in the platform itself. And then you have all of the extended platforms that you're connecting to. Those might change, or they might stay the same. You might not change your knowledge base, for example, but you still need to publish content to it from the new system, and the new system doesn't quite work the way the old system did, so your connector needs to change. Things like that. I would also say, with regard to software, there's also a hit. It'll be a temporary blip, but it will be a costly blip in the localization space, because when you are replatforming, especially if you are migrating to a new format, you're going to take a hit on your 100% matches in your translation memory. Anything that you've translated previously, you'll still have those translations, but how they are segmented will look very different in your localization software.
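To make the translation-memory point concrete, here's a toy sketch in plain Python. The sentences, translations, and the dict standing in for a real TM are all invented for illustration; real TM tools use fuzzy matching and far richer segmentation rules than this.

```python
# Toy translation memory: segment -> stored translation (illustrative only).
tm = {
    "Press the power button.": "Drücken Sie die Ein/Aus-Taste.",
    "The light turns green.": "Die Leuchte wird grün.",
}

def exact_match(segment):
    """Return the stored translation for a 100% match, else None."""
    return tm.get(segment)

# Old platform segmented at sentence boundaries: both segments hit the TM.
old_segments = ["Press the power button.", "The light turns green."]
old_hits = [s for s in old_segments if exact_match(s)]

# New platform merges the sentences into one segment: the words are
# identical, but the segment no longer matches any TM entry exactly.
new_segments = ["Press the power button. The light turns green."]
new_hits = [s for s in new_segments if exact_match(s)]

print(len(old_hits), len(new_hits))  # 2 matches before, 0 after
```

Same content, different segmentation, and the former 100% matches come back as misses that a translator (or a fuzzy-match pass) has to touch again, which is where the "costly blip" comes from.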
SO: Yeah, and there are some weird technical things you can do under the covers to potentially mitigate that, but it’s definitely an issue.
BS: And it’s still costly.
SO: OK, so we’ve decided that we need to replatform and we’ve done the business requirements and we picked a tool and we’re ready to go from A to B, which we are carefully not identifying because some of you are going from A to B and some of you are going from B to A. And it’s not wrong, right? There’s not a single, you know, one CCMS to rule them all.
BS: Mm-hmm.
SO: They’re all different and they all have different pros and cons. So depending on your organization and your requirements, what looks good for you could be bad for this other company. But within that context, what are some of the things to consider as you’re going through this? So you need to exit platform A and migrate to platform B.
BS: Mm-hmm. I think the number one thing you should not do is expect to be able to pick up your content from platform A and just drop it into platform B. It's never going to be that easy, and it shouldn't be something you're really considering, because not only are you replatforming, you're aligning with a new way of working with your content. So just picking it up and dropping it into a new system is not going to help you in that regard. And given that you need to get the content out of the system anyway, that's the best time to look at your content and say, how do we clean this up? What mistakes do we try to erase with a migration project on this content before we put it in the new system?
SO: Yeah, I think it's the decisions that were made that tend to take on a life of their own, like, this is how we do things. And much, much later you find out that it was done that way because of a limitation in the old software. This is like that old story about cutting the end off the pot roast, where it turns out that Grandma did it because her roasting pan wasn't big enough to hold the entire pot roast. It's exactly that, but software, right? So for bad decisions or constraints, you need to test your constraints to see whether your new CCMS is, in fact, a bigger roasting pan that does not require you to cut the end off the pot roast. What about customization?
BS: Customization is a good one. What we're finding is that a lot of the people who are exiting an older system for a newer one have a lot of heavy customization, because in many regards there wasn't a robust content model available at the time. So they had to heavily specialize their content model and tailor it to the type of content they were developing. Now, for something that was built 10 or 15 years ago using heavily specialized structured content, if you look at what's available today, a lot of those specializations have been built into the standard in some way. So it's a great opportunity to unwind a lot of that and use the standard rather than your customization. That helps you move forward: as the specifications for the content model change, you will be aligned with that change a lot better than if you had carried a customization along the way. Specialization, or any kind of customization for that matter, is expensive. It's expensive to build, expensive to maintain, expensive to train people on. It affects every aspect of your content production, from authoring to publishing. There's always something that needs to be specifically tailored, whether it's training for the writers, or designing your publishing pipelines to understand and render those customized models, or making sure the translators' systems can understand your custom tags so that they can show and hide them from the translators, and you don't get translations back that contain translated tags, which we've seen. There's a lot going on there. So the more you can unwind, if you have heavily customized in the past, the better off you will be.
SO: Yeah, and here we're talking, I think, specifically about some of the DITA stuff. So if your older legacy content is in DITA 1.0 or 1.1, they added a lot of tags in DITA 1.3, and they're adding more in DITA 2.0, which might address some of these situations: you added a specialization because there was a gap or a deficiency in the DITA standard, so you could probably take that away and just use the standard tag that got added later. Now, I want to be clear that we're not anti-specialization. I think specialization is great, and it's a powerful tool to align the content that you have, and your content model, with your business requirements. But you have to make sure that when you specialize, all the things that Bill's talking about, all those costs that you incur, are matched by the value that you get out of having the specialization.
BS: Mm-hmm.
SO: So you're going to specialize because it makes your content better, and you have to make sure it makes it enough better to be worth doing all these things. Very broadly, metadata customization nearly always makes sense, because that's a straight-up, we have these kinds of business divisions or variants that we need because of the way our products operate. Those nearly always make sense. Element specialization tends to be a bigger lift, because now you're looking at getting better semantics into your content, and you have to ask the question, do I really need custom things, or is the out-of-the-box DITA, DocBook, or custom XML content model good enough for my purposes? That's kind of where you land on that. And then reuse. I did want to touch on reuse briefly, because we can do a lot of things with reuse, from reusing entire chunks, topics, paragraph sequences, lists of steps, that kind of thing, all the way down to individual words or phrases. And the more creative you get with your reuse and the more complex it is, the more difficult it's going to be to move it from system A to system B.
BS: Absolutely. It'll be a lot more difficult to train people on as well. And we've seen it more times than not that even with the best reuse plan in mind, we still see what we call spaghetti reuse in the wild, where someone has a topic or a phrase or something in one publication and they just reference it into another publication, from one to the other. Some systems will allow that, I'll just put that out there. Other systems will say, absolutely not, you cannot do this; you have to make sure that whatever you're referencing exists in the same publication that you're publishing. So we've had to do a lot of unwinding there with regard to this spaghetti reuse. We've had a podcast in the past with Gretyl Kinsey on our side, who I believe talked extensively about spaghetti reuse: what it is, what it isn't, and why you should avoid it. But yes, as you're replatforming, if you know you have cases like this, it's best to get your arms around it before you put your content in the new system.
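As a rough illustration of what "getting your arms around it" can mean in practice, here's a small Python sketch of a cross-publication reference check. The publication names, topic names, and the flat data model are all invented; a real CCMS would expose this through its own API or map/topic structure.

```python
# Each publication maps to the topics it owns; each reuse reference is a
# (publication, referenced-topic) pair. All names are purely illustrative.
publications = {
    "install-guide": {"intro", "safety", "setup"},
    "user-guide": {"overview", "tasks"},
}

reuse_refs = [
    ("user-guide", "tasks"),   # fine: reused within the same publication
    ("user-guide", "safety"),  # spaghetti: reaches into install-guide
]

def find_spaghetti(publications, reuse_refs):
    """Flag reuse references that reach outside their own publication."""
    flagged = []
    for pub, topic in reuse_refs:
        if topic not in publications.get(pub, set()):
            flagged.append((pub, topic))
    return flagged

print(find_spaghetti(publications, reuse_refs))  # [('user-guide', 'safety')]
```

An inventory like this, run against the exported legacy content, tells you exactly which references a stricter target system will reject before the migration starts, rather than during it.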
SO: Yeah, and we’ll see if we can dig it out and get it into the show notes. What about connectors?
BS: Connectors are interesting. By that, we're talking about webhooks or API calls from one system to another to enable automation of publishing, sharing of content, and what have you. For the most part, if you're not changing one of the two systems, managing that connector can be a little easier, especially if the target, the receiving end of the content, is reaching out and looking for something in a shared folder, using a webhook, or using an FTP server, what have you. But generally, those connectors can get a little sketchy. It might be that your new platform doesn't have canned connectors for the other systems that you have always connected to and need to connect to. So then you need to start looking at, well, do we need to build something new? Can we find some kind of creative midpoint for this? They can get a little dicey. So I think it's important, before you replatform, before you even choose your new content management system, to look at where your content needs to go and whether you have support from that system to get you there.
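The "creative midpoint" Bill mentions is often something as plain as a shared drop folder that the receiving system polls. A minimal sketch of that pattern, using only the Python standard library; the folder layout and the `.xlf` file suffix are assumptions for illustration, not any particular vendor's convention:

```python
import pathlib
import tempfile

def collect_ready_files(drop_folder, suffix=".xlf"):
    """Sketch of the receiving side of a shared-folder connector:
    poll the drop folder and pick up finished files by suffix."""
    folder = pathlib.Path(drop_folder)
    return sorted(p.name for p in folder.glob(f"*{suffix}"))

# Simulate the sending system writing two finished translation files
# plus an in-progress temp file the connector should ignore.
with tempfile.TemporaryDirectory() as drop:
    for name in ["guide.de.xlf", "guide.fr.xlf", "guide.de.tmp"]:
        (pathlib.Path(drop) / name).write_text("<xliff/>")
    picked_up = collect_ready_files(drop)

print(picked_up)  # ['guide.de.xlf', 'guide.fr.xlf']
```

A canned vendor connector does the same handoff with retries, authentication, and status reporting built in, which is why losing one during a replatform hurts more than this toy suggests.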
SO: So a simple example of this is localization. You have a component content management system of some sort that you've stashed all your content in, and then you have a translation management system. And in the old legacy system, the platform you're trying to get off of, you have, or maybe don't have, a connector from the component content management system over to the TMS, the translation management system, and back, so that you can feed it your content and have the content returned to you.
BS: Mm-hmm.
SO: Well, if that connector exists in the legacy platform but not in the new platform, you're going to have to either lean on the vendors to produce a new connector or go back to the old zip-and-ship model, which nobody wants. Or conversely, you were doing zip and ship in the old version, but the new version has a connector, which is going to give you a huge amount of efficiency.
BS: Mm-hmm.
SO: The connectors tend to be expensive and also they add a lot of value, right? Because if you can automate those systems, those transfer systems, then that’s going to eliminate a lot of manual overhead, which is of course why we’re here.
BS: Mm-hmm. Human error as well.
SO: So they're worth looking at pretty carefully to see, as you said, Bill, what's out there, what already exists. Does the new platform have the connectors I need? And if not, who do I lean on to make that happen so that I don't go backwards, essentially, in my processes? Okay, anything else, or should we leave it there?
BS: I think this might be a good place to leave it. We could talk for hours on this.
SO: It'd be a good place to leave it. Let's not and say we did. OK, so with that, thank you for listening to the Content Strategy Experts podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Cutting technical debt with replatforming (podcast) appeared first on Scriptorium.
Just like discovering faulty wiring during a home renovation, technical debt in content operations leads to unexpected complications and costs. In episode 171 of The Content Strategy Experts podcast, Sarah O’Keefe and Alan Pringle explore the concept of technical debt, strategies for navigating it, and more.
In many cases, you can get away with the easy button, the quick-and-dirty approach when you have a relatively smaller volume of content. Then as you expand, bad, bad things happen, right? It just balloons to a point where you can’t keep up.
— Sarah O’Keefe
Transcript:
Alan Pringle: Welcome to the Content Strategy Experts Podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about technical debt and content operations. What is technical debt and can you avoid it? Hey everybody, I am Alan Pringle and I’ve got Sarah O’Keefe here today.
Sarah O’Keefe: Hey everybody.
AP: And we want to talk about technical debt, especially in the context of content operations. And to start off, we should probably have you define what technical debt is, Sarah. I think this is something most people run into during their careers, but they may not have had a label to apply to what they were dealing with. So what is technical debt?
SO: We usually hear about technical debt in the context of software projects. And it is something along the lines of taking the quick-and-dirty solution, which then causes long-term effects and long-term costs. So Wikipedia says it's the implied cost of future reworking because a solution prioritizes expedience over long-term design. And that's really it. You know, I have this thing, I need to deliver it this week. I'm going to get it done as fast as possible. But then later, I'm going to run into all these problems because I took the easy road instead of the sustainable one.
AP: So it's basically when the easy button bites you in the backside weeks, months, or years later.
SO: Yeah, and with any luck you are aware that you’re incurring technical debt. The one that’s really painful is when you don’t realize you’re doing it.
AP: Right, or you didn't know because you weren't part of the process when it happened. And I think this is kind of moving into where I want to go next. Let's talk about some examples, especially in the context of content, of where you can incur or stumble upon technical debt.
SO: So right now, the example that we hear most often is that inconsistencies and problems in the quality of your content, the organization of your content, and the structure of your content lead to a large language model or AI misinterpreting information, and therefore your generative AI strategy fails. So essentially, because the content isn't good enough, genAI tries to see patterns where there are none and then produces some stuff that's just complete and utter junk. Now, the interesting thing about this is that you were probably aware, at least at a high level, that your content wasn't perfect. But the LLM highlights it; it's like a technical debt detector. It will show that, look at you, you took a shortcut and it didn't work, or you didn't fix this. And so here we are. Another good example of this is any sort of manual formatting that you're doing. So you're producing a bunch of content, a bunch of docs, a bunch of HTML pages, PDF, whatever. And in the context of that, you've got some step in there that involves cleaning it up by hand. So I get it 90 to 95% of the way: I just apply the template and it all just works. But then I've got this last step where I'm doing a couple of little finicky cleanup things, and that's okay because it's just an hour or two and all I'm delivering is English. Okay, well, along comes localization, and suddenly you're delivering in not just one language but two or three or a dozen or 27, and what looked like one hour in English is now 28 hours: once for English and 27 times again where you're having to do this cleanup. And so all of a sudden your technical debt balloons into something that's basically unsustainable, because that choice that you made to not automate that last 5% suddenly becomes a problem.
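That "last 5% by hand" is exactly the kind of step worth scripting once instead of repeating per language. A minimal sketch in Python; the two cleanup rules here (collapsing doubled spaces, stripping trailing whitespace) are invented stand-ins for whatever the finicky manual fixes actually are in a given shop:

```python
import re

def cleanup(text):
    """Automate the finicky final pass. The rules below are
    illustrative stand-ins for real per-deliverable cleanup."""
    text = re.sub(r" {2,}", " ", text)          # collapse doubled spaces
    return "\n".join(line.rstrip()              # strip trailing whitespace
                     for line in text.splitlines())

# The arithmetic from the episode: one hour of hand cleanup becomes
# 28 hours once English plus 27 target languages each need the pass.
# A script runs the same pass across all deliverables in seconds.
languages = ["en"] + [f"lang{i}" for i in range(1, 28)]  # 28 in total
cleaned = {lang: cleanup("Step 1:  press  start. \nDone. ")
           for lang in languages}

print(len(cleaned), repr(cleaned["en"]))
```

The point is not these two particular regexes; it's that anything done by hand per output multiplies by the number of languages, while anything scripted multiplies by roughly zero.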
AP: It’s a scalability issue, really, at the core.
SO: Yeah, in many cases, you can get away with the sort of, as you said, the easy button, the quick-and-dirty approach when you have a relatively smaller volume of content. And then as you expand, bad, bad things happen, right? It just balloons to a point where you can’t keep up.
AP: Yeah, and I have recently run into some technical debt, not in the content world, but in the homeownership world. And I’m sure this painful story will resonate with many people and not in a good way. But how many times have you gone to update a kitchen, update a bathroom, only to discover that there was some weird stuff done with the wiring? The plumbing is not like it really should have been. And basically you want to jump into a time machine, go back to when your house was built to have either a gently corrective conversation with the people who are building your house or just murder them outright because you are now having to pay to untangle the mess that was made 30, 40, 50 years ago. I am there right now and it is not a happy place.
SO: And whatever it was they did was presumably cheaper than doing it right. But what they actually paid to do it the cheap way, plus a little more to do it right, you know, would have been an extra 5% or whatever at the time. But now it’s compounded, because you’re having to, you know, in the case of plumbing, tear out walls and go back and replace all these pipes instead. So you have to essentially start over instead of just doing it. Another great example of this is accessibility. So when you start thinking about a house that has grab bars or wide doorways that wheelchairs will fit through, right? If the house was built with it, it costs a little bit more, not a lot, but a little. But when you go back to retrofit a house with that stuff, it is stupidly expensive.
AP: Exactly. And really, these things that we’re talking about in the physical world very much apply when you’re talking about software infrastructure, tool infrastructure. It can be just as bad.
SO: Yeah, I mean, there’s a perception of it’s just software, right? We’re not doing a physical build. We’re not using two by fours. So how bad could it be? It can be real bad. But that is the perception, right? That we’re not building a physical object so we can always go back and fix it. And I mean, you can always go back and fix everything. It’s just how much is it going to cost?
AP: Right, how much time and money and effort is it going to suck up to get you to where you need to be so you can then do the next thing that you intended to actually do in the first place? So yeah, I think this is something where this technical debt, sometimes there is no way around it. You inherit a project, you’ve got some older processes in place and you’re gonna have to deal with it. Are there some strategies that people can rely on to kind of mitigate and make it less painful?
SO: Well, first I’ll say that not all technical debt is bad or destructive. And the canonical example of this is if you’re trying to figure out, is this thing gonna work? I wanna do a proof of concept; I wanna see if the strategy that I’m considering is even feasible. So you go in and you take a small amount of content and you build out a proof of concept or a prototype, like, look, we were able to generate this PDF over here and this HTML over here, and we fed it into the chatbot and everything kind of worked. And you look at it and you say, okay, so that was good enough. And because it was a proof of concept, you maybe didn’t harden it from a design point of view. You just did what was expedient and you got it done. That’s fine, provided that you go into this with your eyes open, knowing where you cut the corners, recognizing that later we’re going to have to do this really well and we probably can’t use the proof of concept as a starting point, or it’s good enough and we can use it as a starting point, but here’s where we cut all the corners. You have this list of, like: we didn’t put in translation and localization support; we didn’t put in all the different output formats we’re going to need, we just put in two to prove that it would more or less work. But I think you made a really good point earlier. So often you inherit these things. So you walk into an organization, you’re brand new to that organization, and you get handed a content ops environment. This is how we do things. Great. And then the next thing that happens is that genAI comes along, or a new output format comes along, or we’ve decided we want to connect it to this other software system over here that we’ve never thought about before, or, hey, we’re bringing in a new enterprise resource planning system and we need to connect to it, which was never in the requirements on day one.
And now you realize, looking at your environment, that you can’t get from what you have to where you need to be, because the requirements shifted underneath you. Or you came in and you just didn’t have a good understanding of how and when these decisions were made, because it was five or 10 years ago with your predecessor, kind of thing. So how do we deal with this? I mean, it just sounds awful, but you have to manage your debt just like actual debt.
AP: All right, sure.
SO: Right, so understand what you have and haven’t done. We have not accounted for localization; we’re pretty concerned about that if and when we get to a point where we’re doing localization. Scalability: we are only going to be able to scale to maybe 10 authors, and if we end up with 20, we’re going to have a big problem, so let’s just be aware of that when we get to eight or nine. But the thing is, beyond the technical debt that you identify, that you know about, hopefully unlike personal finance, you always have more debt than you think you have, right? Because in the content world, things change. Or in your housing example, the building code changes. So they built the thing umpteen years ago, and it was okay in the sense that it conformed with the requirements of the building code at the time, I assume.
AP: Of course.
SO: And now you’re going in and you’re making updates and suddenly the new building code is in play and you’re faced with the technical debt that accrued as the building code changed, but your house, your physical infrastructure did not change. And so there’s a gap between where you need to end up and where you are, part of which is just time has elapsed and things have changed.
AP: Right, and that is very true of some of the requirements you mentioned in regard to content operations. Generative AI, that’s what, the past two years, if that? That wasn’t on the horizon five years ago when some decisions were made. It absolutely parallels. And when it comes to personal finance, sometimes things get so bad, you have to declare bankruptcy. And I think that can apply to technical debt as well.
SO: Yeah, it’s, you know, it’s an unhappy day when you look at, you know, a two-story house and you’ve been told to build a 50-story skyscraper. It just can’t be done, right? You cannot take, you know, a sort of stick-built house made of wood and put 50 stories on top of it. At least I don’t think so. We’ve now hit the edges of what I know about construction. So sorry to all the construction people. You build differently if you know that it’s going to be required to be 50 stories, even if you only build the initial two. So either you build two knowing that eventually you’ll scrap it and start over with a new foundation, or you build what amounts to a two-story skyscraper, right, that you can then expand on as you go up. So you overbuild, I mean, completely overbuild for two stories, knowing where you’re going.
AP: Scalability.
SO: But yeah, we have a lot, a lot of clients who come in and say, you know, we’re in unstructured content, you know, Word, unstructured FrameMaker, InDesign, basically a PDF-only workflow. And now we need a website, or we need all of our content in like a content-as-a-service API kind of scenario. And they just can’t get there. From a document-based, page-based, print-based, PDF-targeted workflow, you can’t get to “and also I wanna load it into an app in nifty ways.” I mean, you could load the PDF in, but let’s not. So you end up having to say, this isn’t gonna work. This is the “I have a two-story suburban house and I’ve been told to build a 50-story skyscraper” problem. Languages and localization are really, really common causes of this. So separately from the “I need a website in addition to PDF,” the “We’re only going to one or two languages, but now we’re going to 30 because we’re going into the European Union” is a really, really common scenario where suddenly your technical debt is just daunting.
AP: So basically you’re in a burn it all down situation. Just stop and start all over again.
SO: Yeah, I mean, it’s not that you did it wrong. It’s that your requirements changed and evolved, and your current tools can’t do it. So it’s a burning platform problem, right? The platform I’m on isn’t going to work anymore, and so I have to get to that other place. It’s really unpleasant. Nobody likes landing there, because now you have to make big changes. And so I think ideally, what you want to do is evolve over time, evolve slowly, keep adding, keep improving, keep refactoring as you go so that you’re not faced with this just crushing task one day. But with that said, most of the time, at least the people we hear from have gotten to the crushing horror part of the world because it’s good enough. It’s good enough. It’s not great. We have some workarounds. We do our thing until one day it’s not good enough.
AP: And it’s very easy to get used to those workarounds. That is just part of my job. I will deal with it. You kind of get a thick skin and just kind of accept that’s the way that it is. While you’re doing that, however, that technical debt in the background is accruing interest. It’s creeping up on you, but you may not really be that aware of it.
SO: Right. Yeah, I’ve heard this called the missing stair problem. So it’s a metaphor for the scenario where, again, in your house or in your life, there’s a staircase and there’s a stair missing and you just get used to it, right? You just climb the steps and you hop over the missing stair and you keep going. But you bring a guest to your house and they keep tripping on the stairs because they’re not used to it, at which point they say, what is the deal with the step? And you’re like, yeah, well, you just have to jump over stair three because it’s not there or it’s got a, you know, missing whatever. So missing stair is this idea that you can get, you can get used to nearly anything and the workaround just becomes, “Get used to jumping.”
AP: And it ties in again: there’s technical debt there, but you have kind of almost put a band-aid on it. You’re ignoring it. You’ve just gotten used to it. So really, there’s no way to prevent this? Is it preventable?
SO: I mean, if you staffed up your content ops organization to something like 130% of what you need for day-to-day ops, and dedicated the extra 30%, or maybe 10%, but you know, the extra percentage, to keeping things up to date and constantly cleaning up and updating and refactoring and looking at new things, then maybe. But no, there’s no way to do it, and everybody is running so lean.
AP: I’m gonna translate that to a no. That is a long no. So yeah.
SO: And as a result, you make decisions and you make trade-offs and that’s just kind of how it is. I think that it’s important to understand the debt that you’re incurring, to understand what you’re getting yourself into. And, you know, I don’t want to, you know, beat this financial metaphor to death, but like, did you take out like a reasonable loan or are you with the loan sharks? Like how bad is this and how bad is the interest going to be?
AP: Yeah, so there’s a lot to ponder here, and I’m sure a lot of people are listening to this and thinking, I have technical debt and I’ve never even thought about it that way. It is a topic that is unpleasant, but it is something that needs to be discussed, especially if you’re a person coming into an organization and inheriting something. You may not have had any say in the decisions that were made 10 years ago, five years ago, and things have changed so much; that might be why they’ve brought you in. So it is something that you’re gonna have to untangle.
SO: Yeah, sounds about right. So good luck with that. Call us if you need help, but sorry.
AP: Yeah, so if you do need help digging out of the pit of technical debt, you know where to find us. And with that, I’m going to wrap up. Thank you, Sarah. And thank you for listening to the Content Strategy Experts podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
SO: Thank you.
The post Renovation revelations: Managing technical debt (podcast) appeared first on Scriptorium.
In episode 170 of The Content Strategy Experts podcast, Bill Swallow and Christine Cuellar dive into the world of content localization strategy. Learn about the obstacles organizations face from initial planning to implementation, when and how organizations should consider localization, localization trends, and more.
Localization is generally a key business driver. Are you positioning your products, services, what have you for one market, one language, and that’s all? Are you looking at diversifying that? Are you looking to expand into foreign markets? Are you looking to hit multilingual people in the same market? All of those factors. Ideally as a company, you’re looking at this from the beginning as part of your business strategy.
— Bill Swallow
Related links:
LinkedIn:
Transcript:
Christine Cuellar: Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we are talking about content localization strategy. So maybe you’re starting to think about introducing a localization strategy. Maybe you’re hitting some pain points in your localization processes, all that good stuff we’re going to be talking about today. Hi, I’m Christine Cuellar.
Bill Swallow: And I’m Bill Swallow.
CC: Bill, thanks for being here today to talk about localization. Bill is our go-to localization expert, and localization has been coming up a lot. So I noticed for me, on the marketing side of things, there’s been a lot of, you know, SEO stuff coming up for localization. People seem to be searching about it, asking questions at a more beginning to thinking about the whole localization process level. So that’s what we wanted to talk about today. Give you the chance to have some upfront knowledge about what you could be getting into with introducing localization in your content strategy. And yeah, let’s talk about it with an expert. So thanks, Bill.
BS: Thank you.
CC: First things first, the most basic question, what is content localization strategy? So what do we mean by that?
BS: Okay, so I can kind of frame this in, I guess the same point of view as a content strategy, but basically you’re taking a look at your entire localization process from start to finish. Plus you’re looking at what are the systems that are involved? How are authors prepping the content for localization? Are they writing well upfront? What does the publishing preparation look like? How are you choosing your translators? Are you going to pure machine translation? Are you using live people to do the translation? Are you using people who are content experts? Are you using people who are market experts? So there are a lot of different factors there that all kind of get balled up into this grander strategy of how are you going to approach getting your content authored and translated appropriately in other regional markets.
CC: Yeah, okay. That makes sense. And taking a step back even further, can you walk me through the difference between localization and localization strategy?
BS: Sure. Localization itself is kind of more of an action, whereas strategy is more planning around that action. I think that’s the best way to put it. So localization involves a bunch of different things. It involves the act of internationalization. So that’s prepping your content, your code, your product, whatever it is, to be delivered for multiple regional and language markets. And then you have the translation component of localization, which is actually getting things written, spoken, however, in other languages. And the strategy piece is more bridging both of those and adding additional components so that you have a solid plan for every step in that process.
CC: Okay, yeah, that makes sense. And where do we step in? We here at Scriptorium, where do we sit?
BS: Generally we at Scriptorium, we sit on the source content authoring side. And we look at the overall content strategy, and we do look at a localization strategy as a component of that. They’re not separate. They’re very intertwined and we need to take a look at really both of them. So a lot of our clients do come to us because they have localization requirements.
And we have to account for those in the content strategy that we build for them. So we’re looking not only at the source content authoring process and what needs to happen in that to get the job done, but we also have to look at where they are going with their content, how are they going to localize it, what do they need to localize, what processes do they have in place now? Are they working? Are they not? Looking at systems: are they adequate? Are they not? And look at the markets. Are they already reaching those markets? Do they need to do something different? How do we need to position the content as it moves through that funnel of production so that when it comes out the other side, it is ready for those markets? So they’re kind of intertwined there.
CC: Okay. Yeah. So when are organizations typically thinking about a content localization strategy?
BS: Well, localization generally is a key business driver. Are you positioning your content for one market, one language, and that’s all? Or are you positioning… I shouldn’t say just content, because it’s products, services, what have you. Are you looking at diversifying that? Are you looking to expand into foreign markets? Are you looking to hit multilingual people in the same market? All of those factors. So ideally as a company, you’re looking at this from the beginning as part of your business strategy. What are you producing? Who are you producing it for? How do they need to consume it? So as soon as you catch a whiff of those multilingual requirements, bells should be going off saying, “Hey, we need a plan for this.” More commonly, an organization might be producing for one market or producing for several markets. They’re kind of doing things ad hoc: producing content, then sending it out to a translator. They’re getting something back, they may be polishing it up, or it’s a finished product and then they send it out. It’s a very time-consuming process. It’s a very costly process, and it’s very difficult to juggle when things will be done. Because if you don’t have a set process around things, and you don’t have an idea of how long things will take and what efficiencies you’re able to build up front and so forth, you’re throwing caution to the wind and just putting stuff out there, hoping that it comes back in time so that you can go to market with it. We’ve worked with clients who have said that generally it takes about nine months or so to get their localized product out the door and into the market after the English is done. And for a lot of those, we’ve brought that number down to three months, one month, depending on exactly what they’re producing and how they need to produce it, so…
CC: Yeah, it’s a huge difference.
BS: Looking at that… Oh, huge difference. And looking at that time to market, that’s perhaps more valuable than the cost that you’re dumping into putting a localization strategy or a content strategy together because you’re able to sell quicker into those markets. You’re not waiting for the opportunity to start seeing revenue come back from the initiatives that you’re taking to get stuff out there.
CC: Yeah. Yeah, that makes sense. And I feel like… So correct me if I’m wrong here, but in the global world that we live in, it feels like localizing products and getting them ready for new regions is a very… I think that would be something that executives think about from the get-go like, yes, of course we want our product ready for new regions and locations. But why is the… It sounds like maybe the content piece of that is not thought about or maybe left behind until it’s an absolute emergency. Would you say that that’s… First of all, is that accurate?
BS: Sadly, I’d say yes.
CC: Okay.
BS: Content is often an afterthought in general, whether we’re talking about producing stuff just in your native language for a native market. Localization is usually even more of an afterthought because it’s like, oh, well, we wrote it in English, we’ll just have someone translate it. And by then you’re waiting until that product is done and then sending it to somebody else who’s looking at it going, “I can’t make sense of this. It’s not written well. And I’m going to take my best guess at how to translate this.” It could take months to get that back.
CC: So maybe organizations see the value in having their products and services available in other markets, but they don’t necessarily think of all of the content localization pieces that are involved in getting that out the door.
BS: No, and it’s similar for pretty much anyone trying to get anything done. For example, I really want to put a new patio in the back of my house. I even have an idea of exactly how that should go in. But I don’t have the time, I don’t have the materials needed to do it, and I’d much rather rely on somebody else who knows what they’re doing to put it in the correct way, so it’s not graded improperly, so that there aren’t uneven portions that people will trip over, and so forth. So looking at it that way, it’s the same thing with localization. People who are running a company or starting a company may have an idea that yes, they need to get from point A to point B to point C to point D. They don’t know the steps along that path, and they need some help figuring out, okay, it’s not that you just write your English content, you throw it over to somebody else, and they send it back. It’s a more intricate process. You have some systems in place that will manage that handoff, that will allow people to gate the content and proof it and make sure it’s correct before it goes anywhere. And you may have some other efficiencies built in that allow you to automatically format things when the time comes to actually produce. So there are a lot of bits and pieces that people just generally don’t think about, because it’s not in their wheelhouse.
CC: Yeah, they can’t know what they don’t know.
BS: Exactly.
CC: Okay. So it sounds like most organizations realize that this is a problem once they’re actually trying to get their product out the door and into a new market, into a new region. What are some obstacles to getting a content localization strategy set up? I’m sure that one issue is probably like, oh, you’re in emergency mode and we just need to get this product out the door. That might present a challenge in and of itself.
BS: Absolutely.
CC: Yeah. Are there other obstacles as well to getting a more future-focused strategy in place?
BS: Oh, that one is a good one. That is the first hurdle to get over.
CC: Is the emergency mode.
BS: So being able to recognize or realize that you’re in emergency mode, getting out of that mindset, and saying, okay, this won’t be a forever problem of just waiting and hoping for good quality coming out in the end. Once you’re able to realize that, you need to break that mindset and start looking forward, and then we start hitting other obstacles. One of them is going to be funding, because there will be systems involved, there will be personnel required, there will be processes that need to change, and so forth. And that will certainly cost a lot upfront. But you’re going to see that return on investment in a pretty quick amount of time. We’ve seen one company make their investment back within a year, but they were producing an insane number of languages already, and they just needed to tidy up their process. And again, by bringing that window in from nine months to about a month and a half or so to get their localized stuff out, they were able to quickly realize that return on investment. But another one is buy-in, because you have a lot of people who are busy doing their job, and you’re suddenly telling them that they need to change how they do their job. It might be abandoning the tools that they like to use, writing in a different way, looking at publishing in a different way, and interacting with people who they normally don’t interact with on a day-to-day basis. So your source authors are interacting with a localization manager internally who needs to send stuff out to translators, or your writers are interacting with translators to explain what they had written, so that the translator has a definitive idea of what it is and how to translate it for the market they’re translating for. And then of course, you have the obstacle of governance, and change management comes along with that.
You need to be able to make sure that with any of the changes you introduce, people are following the new way of doing things and aren’t falling back on old habits, bad ones or even ones that were good at the time. And you need to make sure that you have these gating processes, so that once something is written in English, you have a formal review on it to make sure it’s correct, to make sure it’s written appropriately. That goes out to translation. They have their own gating process of making sure they receive all the files, that they understand the content that they have, and all the supporting information that they need to help them translate and localize this information for that market. Then of course, they do their own quality checks. It comes back, and you make sure that there’s a final review on the company side to make sure the translation seems good. And then you’re able to publish and deliver. So it still sounds like a lot of gating factors, but once you get things going and figure out where you can expedite and make things a lot easier, you start to bring in that entire timeline.
CC: Yeah, that makes sense. You mentioned buy-in, and so I could see how if people feel like their workload’s being increased by suddenly needing to talk to more people, coordinate between more departments or even just have more things on their radar, I could see how that could create a lot of, oh, I don’t know if I want to go in this direction. What are some ways… And that’s probably one of the… As you mentioned, that’s just one of a few buy-in challenges. What are some of the ways that you maybe win people over or show people how this can benefit their work life versus just make it harder?
BS: That’s a good question. I think that authors in general want to understand where their content is going and who is consuming it. And we’re talking about corporate content here, everything from website content to product manuals to troubleshooting tips and training materials. Even though it belongs to the company, a lot of authors tend to have a kind of, I guess, personal pride built around what they write.
CC: Yeah, okay. Yeah, that makes sense.
BS: So knowing who is consuming it down the road and the reason why you have these additional checkpoints and processes in place will kind of help, I think get a lot of them around the idea of, yeah, this is a good thing and I’m looking forward to helping any way I can. Because the last thing they want is to have something written completely correctly in English and have it go out to, I guess let’s say a market in Denmark. And the content was translated incorrectly because the translator maybe didn’t understand what something meant, and they gave it a different term, which had a different meaning in that market.
CC: Yeah. And I could also see from a safety standpoint that it could be really dangerous too, if you’re not properly translating instructions for high-stakes content, medical devices, stuff like that. Just like you do in English, you want that content to be accurate and understandable. Because if it’s not accurate, of course it’s wrong and people could get hurt. And people could also get hurt if they don’t understand it, even if it’s totally accurate but just hard to understand. That presents, I’m sure, a lot of dangerous situations where people could get hurt and your company is liable. So yeah, it makes sense that you would really want to have a good process in place.
BS: Oh, absolutely. And even more along those lines, the regulations that we have to adhere to here in the US are somewhat different to… very different to anywhere else in the world. There are different directives in place depending on where you are regionally, things that have to be included, that have to be said a very specific way. So I guess the easiest way to look at it is that there are more legal ramifications in the US. So you could get sued if something is wrong, whereas if you go over to the UK, it’s generally more that there’s a directive you have to follow, and you simply cannot release in that market if your, for example, machinery content does not meet that specific directive’s requirements. So there’s a slightly different approach. There’s still a legal ramification if things go wrong, but there’s also another set of requirements that need to be met before you even start worrying about the legal stuff.
CC: And are most organizations aware of those kind of requirements when they start trying to get into a new market?
BS: Some of them might be, but again, if you’re in one particular region, chances are that’s the region you’ve grown up with and that’s the region you understand. And there’s been very little attention paid to what the requirements are in other geographic regions, other countries, and so forth. So I can’t say whether it’s common or not common. But in general, you know what you know. And when you’re looking to move to a foreign market, there’s the foreign context. You’re going to have very little insight into what that foreign market demands, by its very nature. As a company moves into a new language market, a new geographic market, they’re going to learn things as they go, and they’re going to bring that knowledge back and refine how things are being done currently so that it also satisfies that new requirement. And it’s going to be an iterative process until they really get their arms around it. And again, going back to a localization strategy for your content, you can kind of start putting those feelers out. Because if one market has one set of requirements, it’s like, wait a minute, now we want to go to three. What are the requirements for the other two before we even start thinking in that direction? So you’re able to start building upon that strategy that you’re developing. I mean, we’re not experts in all the requirements for every single market on the face of the earth, I can say that outright, but we can help companies start to identify what they need to start looking into before they start running.
CC: So since we mentioned one of the reasons this topic came about was seeing some SEO search trends, people trying to get more information on localization. What other trends are you seeing in localization right now?
BS: I think the big one is still going to be machine translation. It’s continually evolving and getting smarter. Still not, I would say, better than a human, but it’s certainly quicker, and we’re getting there. We talk about AI a lot, and obligatory nod to AI for this podcast, but I think I mentioned on another podcast already that machine translation was really like AI alpha or AI beta: it was already using an algorithm to start putting together translations for written text. So with AI in the mix now, we’re getting a lot more interesting, a lot more targeted results with machine translation. I still don’t think it’s a perfect solution, and it will certainly need some proofreading, but it’s come a long way, and that trend is certainly not going to fall off the radar anytime soon. In fact, recently Sarah O’Keefe had a podcast with Sebastian Göttel about strategies for AI and technical documentation, and they actually recorded that podcast in German. They used AI to translate and voice augment into English. So not only were things machine translated from German into English, but the German speech was then synthetically reproduced in English, which is just really cool.
CC: Yeah, it’s super cool to listen to, and we’ll link both versions in the show notes, the German version and the English version. But you’re right, it was a super cool process. And you mentioned earlier there was a human piece that was still needed. It was originally recorded in German, then we got the German transcript and translated that into English. At first it was Google Translate, just to get it all done, but then Sarah needed to go and check it, because she speaks both English and German. We needed that human element to make sure the translation was correct. Because like you were saying, you can’t just put it into a machine and say, cool, yay, it’s done. We need a human to make sure it was actually translated properly and that things make sense. And we did notice, once Sebastian’s synthetic audio was created in English, that a lot of the prompts or questions were different lengths. The English version of the exact same question was sometimes shorter, sometimes longer; the languages are just different. So it was a really cool experiment, and it does open up some interesting possibilities for localization. And we’ve never been able to have a German and English podcast before, so that’s kind of cool.
BS: Yeah, no, it was very cool. I sat in the back of the room just watching the entire process, but it was definitely something I was quite interested in seeing. There was a lot of editing of the English translation because, again, it was pure machine translation and it needed some help. But once that was done, the synthetic audio really came right together, and I was impressed by how that happened.
CC: Yeah. And it’s so interesting because it’s definitely… It sounds like Sebastian, but then also it sounds not quite human, but it’s really close. It’s really interesting. But it did-
BS: Very uncanny valley.
CC: Yeah, it was, and I only speak English. I don’t speak German, so it made that podcast accessible to me. I was able to listen to it, and it does present some interesting opportunities, but as always with AI, the human element was definitely needed. It was very important to make sure that the humans at the other end of the screen could eventually consume it.
BS: Oh, yeah.
CC: Awesome. Well, Bill, thank you so much. We covered a lot of ground today, and we really appreciate it. This was really helpful, and yeah, thanks for being on the show.
BS: Yeah, thanks.
CC: And thank you for listening to the Content Strategy Experts podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Accelerate global growth with a content localization strategy appeared first on Scriptorium.
In episode 169 of The Content Strategy Experts podcast, Sarah O’Keefe and special guest Sebastian Göttel of Quanos engage in a captivating conversation on generative AI and its impact on technical documentation. To bring these concepts to life, this English version of the podcast was created with the support of AI transcription and translation tools!
Sarah O’Keefe: So what does AI have to do with poems?
Sebastian Göttel: You often have the impression that AI creates knowledge; that is, creates information out of nothing. And the question is, is that really the case? I think it is quite normal for German scholars to not only look at the text at hand, but also to read between the lines and allow the cultural subtext to flow. From the perspective of scholars of German literature, generative AI actually only interprets or reconstructs information that already exists. Maybe it’s hidden, only implicitly hinted at. But this then becomes visible through the AI.
How this podcast was produced:
This podcast was originally recorded in German by Sarah and Sebastian, then Sarah edited the audio. Sebastian used Whisper, OpenAI’s speech-to-text tool, to transcribe the German recording, followed by necessary revisions. The revised German transcript was machine translated into English via Google Translate, and then we cleaned up the English transcript.
Sebastian used ElevenLabs to generate a synthetic audio track from the English transcript. Sarah re-recorded her responses in English and then we combined the two recordings to produce the composite English podcast.
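As a rough sketch, the production workflow described above looks like a small pipeline with a human review pass after each machine step. Every function below is a hypothetical stub: a real implementation would call the actual tools (Whisper for speech-to-text, Google Translate, ElevenLabs for synthetic audio) rather than these placeholders.

```python
# Sketch of the German-to-English production workflow described above.
# Every function is a hypothetical stub; a real implementation would call
# Whisper (speech-to-text), Google Translate, and ElevenLabs, with a
# human review pass after each machine step.

def transcribe_german(audio_path: str) -> str:
    """Stub for Whisper speech-to-text on the German recording."""
    return f"[German transcript of {audio_path}]"

def machine_translate(german_text: str) -> str:
    """Stub for machine translation (German to English)."""
    return german_text.replace("German", "English (machine-translated)")

def human_review(text: str, reviewer: str) -> str:
    """A bilingual human corrects the machine output: the essential step."""
    return f"{text} [reviewed by {reviewer}]"

def synthesize_audio(english_text: str) -> str:
    """Stub for ElevenLabs synthetic voice generation."""
    return f"[synthetic audio for: {english_text}]"

def produce_english_episode(audio_path: str) -> str:
    # Machine step, then human review, at each stage of the pipeline.
    transcript = human_review(transcribe_german(audio_path), "Sebastian")
    translation = human_review(machine_translate(transcript), "Sarah")
    return synthesize_audio(translation)

result = produce_english_episode("episode_169_de.mp3")
```

The point of the sketch is the alternation: no machine output goes forward without a bilingual human checking it first.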
Transcript:
Sarah O’Keefe: Today’s episode is available in English and German. Since our guest works with AI in German-speaking countries, we had the idea to create this podcast in German. The English version was then put together with AI support, particularly synthetic audio. So welcome to the Content Strategy Experts Podcast, today offered for the first time in German and English. Our topic today is Information compression instead of knowledge creation: Strategies for AI in technical documentation. In the German version, we tried to put it all together in one nice long word, but it didn’t quite work. Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about best practices for AI and tech comm with our guest Sebastian Göttel of Quanos. Hello everyone, my name is Sarah O’Keefe. I am the CEO here at Scriptorium. My guest is Sebastian Göttel. Sebastian Göttel has been working in the area of XML and editorial CCMS systems in technical documentation for over 25 years. He originally studied computer science with a focus on AI. Currently, he is Product Manager for Schema ST4 at Quanos, one of the most used editorial systems in machinery and industrial engineering in the German-speaking regions. He is also active in Tekom and, among other things, contributed to version 1 of the iiRDS standard. Sebastian lives with his wife and daughter, three cats, and two mice just outside Nuremberg. Sebastian, welcome. I look forward to our discussion. In English, we say create once, publish everywhere. This is about recording once and outputting multiple times. So, off we go. Sebastian, our topic today is, as I said, information consolidation instead of knowledge creation and how this strategy could be used for AI in technical documentation. So please, explain.
Sebastian Göttel: Yes, first of all thank you for inviting me to the podcast. It’s not that easy to impress a 14-year-old daughter. And I thought, with this podcast I have a chance. So I told her that I would be talking about AI on an American podcast soon. And the reaction was a little different than I expected. “Youuuu will speak English?” You can put quite a lot of meaning into a single “youuuu” like that. And that’s why I’m glad that I can speak German here. But, and this is now the transition to the topic, what will the AI make of “Youuuu will speak English”? How does it pronounce that correctly in text-to-speech or translate it into another language? And that’s what I think our conversation will be about today. If we want to understand how AI understands us, but also how we can use it in technical documentation, then we have to talk about information compression, but also invisible information. “Youuuu will speak English?” Can the AI conceptualize that my daughter doesn’t trust me to do this, or simply finds my German accent in English dreadful? Well, if the AI can understand that, then is it new information, or actually information that was already there, which both father and daughter were aware of during the conversation? I find it quite exciting that German literature scholars have often dealt with this. Namely, what is in such a text, and what is meant in the text? What’s between the lines? And when you think back to your school days, these interpretations of poems immediately come to mind.
SO: So poems. And what does AI have to do with poems?
SG: Yes, well, you often have the impression that AI creates knowledge; that is, creates information out of nothing. And the question is, is that really the case? I think it is quite normal for German scholars to not only look at the text at hand, but also to read between the lines and allow the cultural subtext to flow. And from the perspective of scholars of German literature, generative AI actually only interprets or reconstructs information that already exists. Maybe it’s hidden, only implicitly hinted at. But this then becomes visible through the AI. Wow, I never thought I would refer to German literature scholarship in a technical podcast.
SO: Yes, and me neither. But the question remains, how does AI work and why does it work? And then why do these problems exist? What is our understanding of the situation today?
SG: Well, I think we’re still pretty impressed by generative AI, and we’re still trying to understand what we’re actually perceiving and what’s happening there. There are things that just make our jaws drop. And then there are those epic fails again, like this recent representation of World War II German soldiers by Gemini, Google’s generative AI. According to our current understanding, the soldiers were politically correct. And there were, among other things, Asian-looking women with steel helmets. I always like to compare this with the beginnings of navigation systems. There were always these anecdotes in the newspaper about someone driving into the river because their navigation system mistook the ferry line for a bridge. It was relatively easy to fix such an error in the navigation system. It was clear why the navigation system made the mistake. Unfortunately, with generative AI it’s not that easy. We don’t know, actually, we haven’t even really understood how these partially intelligent achievements come about. But the epic fails make us aware that it’s not an algorithm, but a phenomenon that seems to emerge if you pack many billions of text fragments into a matrix.
SO: And what do you mean here by “emerge”?
SG: That is a term from natural science. I once compared it to water molecules. A single water molecule isn’t particularly spectacular, but if, for example, you’re sailing in a storm on the Atlantic or hitting an iceberg, you get a different perspective. Because if you put many water molecules together, completely new behavior emerges. And it took physics and chemistry many centuries to partially unravel this. And I think we will, maybe not for quite as long, but we will have to do a lot more research into generative AI in order to understand a little more about what exactly is happening. And I think the epic fails should make us aware that we would currently do well not to blindly place our fate in the hands of a Large Language Model. I think the human-in-the-loop approach, where the AI makes a suggestion and then a human looks at it again, remains the best mode for the time being. The translation industry, which feels like it is a few years ahead of the world when it comes to generative AI or neural networks, has recognized this quite cleverly and implemented it profitably.
SO: And if translation is the model, what does this mean for generative AI and technical documentation?
SG: That’s a good question. Let’s take a step back. So at the beginning of my working life, there was a revolution in technical documentation, these were structured documents; SGML and XML. This has been known for several decades now, and it is still not used in every editorial team. And that means we now have these structured documents and the other thing, which are the nasty unstructured documents. I always thought that was a bit of a misnomer because unstructured documents are actually structured. Well, at least most of the time. There’s a macro level where I have a table of contents, a title page, and an index. There are chapters. Then there are paragraphs, lists, and tables and that goes down to the sentence level. I have lists, prompts, and so on. It’s not for nothing that some linguists call this text structure. And if I now approach XML, the beauty of XML is that I can now suddenly make this implicit structure explicit. And the computer can then calculate with our texts. Because if we’re being honest, in the end, XML is not for us, but for the machine.
SO: Is it possible then that AI can discover structures that, for us humans, have so far only been expressed through XML?
SG: Yes. Well, I recently looked into Invisible XML. There you can overlay patterns onto unstructured text, and they become visible as XML. Very clever. I think generative AI is a kind of Invisible XML on steroids. The rules aren’t as strict as in Invisible XML, but genAI also understands linguistic nuances. I found it very exciting: a customer of ours fed unstructured PDF content into ChatGPT in order to convert it to XML. The AI was surprisingly good at discovering the invisible structure that was hidden in the content, and it converted to XML really well. So that was impressive. When AI now appears to create information out of nothing, I think it is more likely that it makes existing but hidden information visible.
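As a toy illustration of “making implicit structure explicit” (and not the customer’s actual ChatGPT workflow), even simple pattern matching can surface structure that is already present in so-called unstructured text. Invisible XML and the LLM-based conversion rely on the same principle, just with much looser, fuzzier rules.

```python
import re

# Toy example: the numbered-step structure is already implicit in the
# plain text; wrapping it in XML only makes it explicit. An LLM-based
# conversion performs the same job with far looser, fuzzier rules.

plain = """Maintenance procedure
1. Switch off the machine.
2. Lock out the power supply.
3. Replace the filter cartridge."""

def strip_number(step: str) -> str:
    # Remove the leading "1. ", "2. ", ... list markers.
    return re.sub(r"^\d+\.\s*", "", step)

def to_xml(text: str) -> str:
    lines = text.splitlines()
    title, steps = lines[0], lines[1:]
    body = "\n".join(f"  <step>{strip_number(s)}</step>" for s in steps)
    return f"<procedure>\n  <title>{title}</title>\n{body}\n</procedure>"

print(to_xml(plain))
```

No information is added by the conversion; the title and the steps were there all along, just not machine-readable.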
SO: Yes, I think the problem is that this hidden structure, in some documents, it’s there, but in others, there’s what we call “crap on a page” in English. So that’s, there’s no structure. And from one document to another, there is no consistency, so they are completely different. Writer 1 and Writer 2, they write and they never talk. And so if the AI now creates an entire chapter and an outline from a few keywords, how does it work? How does that fit together?
SG: Yes, you’re right. So far we’ve been talking about taking a PDF and adding XML to it. But if I’m put on the spot, I’ll throw in a few keywords and ChatGPT suddenly writes something. I think the same idea applies here, though: this is actually hidden information. It might sound a bit daring at first, but there’s nothing new, nothing completely surprising. If I just ask, let’s say, ChatGPT to give me an outline for documentation for a piece of machinery, then something comes out, and I think most of our listeners would say the same thing: this is nothing new. This is hidden information contained in the training data, which is easily made visible through the query. Because ultimately, generative AI creates this information from my query and this huge amount of training data. And the answer is chosen so that it fits my query and the training data well. It creates a synthetic layer over the top. And in the end, the result is not net new information, but hopefully the necessary information delivered in a way that’s easier to process further. Either, like the example with PDF, enriched with XML, or maybe I now have an outline. And I imagine it’s a bit like a juicer. The juicer doesn’t invent juice, it just extracts it from the oranges.
SO: Making information easier to process sounds almost like a job description for technical writers. And what about other methods? So if we now have metadata or knowledge graphs, what does that look like?
SG: That’s right, in addition to XML, these are also really important. So metadata, knowledge graphs. I find that metadata condenses information into a few data points and the knowledge graphs then create the relationships among these data points. And this is precisely why knowledge graphs, but also metadata, make invisible information visible. Because the connections that were previously implicit can now be understood through the knowledge graphs. And that can be easily combined with generative AI. At the beginning, the knowledge graph experts were a bit nervous, as you could tell at conferences, but now they’re actually pretty happy that they’ve discovered that generative AI plus knowledge graphs is much better than generative AI without knowledge graphs. And of course, that’s great. By the way, this isn’t the only trick where we have something in the technical documentation that helps generative AI get going. If you want to make large knowledge bases searchable with Large Language Models, you can do that today with RAG, or Retrieval Augmented Generation. And this means you can combine your own documents with a pre-trained model like ChatGPT very cost-effectively. If you now combine RAG with a faceted search, as we usually have in the content delivery portals in technical documentation, then the results are much better than with the usual vector search, because in the end it is just a better full-text search. That’s another possibility where structured information that we have can help jump-start AI.
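The combination Sebastian describes, narrowing the knowledge base with metadata facets before retrieval, can be sketched as follows. The documents and facet names are invented for illustration, and naive term overlap stands in for a real vector search.

```python
# Sketch of facet-filtered retrieval for RAG: filter the knowledge base
# by metadata facets first (as a content delivery portal would), then
# rank only the survivors. The documents and facet names are invented,
# and naive term overlap stands in for a real vector search.

docs = [
    {"text": "Replace the filter cartridge every 500 hours.",
     "facets": {"product": "pump-x", "type": "maintenance"}},
    {"text": "The pump-x ships with a 3-phase motor.",
     "facets": {"product": "pump-x", "type": "spec"}},
    {"text": "Replace the belt on the sorter quarterly.",
     "facets": {"product": "sorter-9", "type": "maintenance"}},
]

def retrieve(query: str, facet_filter: dict, k: int = 2) -> list:
    # 1. Faceted pre-filter: keep only documents matching every facet.
    pool = [d for d in docs
            if all(d["facets"].get(f) == v for f, v in facet_filter.items())]
    # 2. Naive relevance ranking by query-term overlap.
    terms = set(query.lower().split())
    ranked = sorted(pool,
                    key=lambda d: len(terms & set(d["text"].lower().split())),
                    reverse=True)
    return [d["text"] for d in ranked[:k]]

# The retrieved passages would then be packed into the LLM prompt.
context = retrieve("how do I replace the filter cartridge?",
                   {"product": "pump-x", "type": "maintenance"})
```

The facet step is what the existing content delivery portals already provide; the LLM only ever sees passages that survived that filter.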
SO: Is it your opinion that structured information will not become obsolete through AI, but will actually become more important?
SG: My impression is that the belief has taken hold that structured information is better for AI. I think we’re all a bit biased, naturally. We have to believe that. These are the fruits of our labor. It’s a bit like apples. The apple from an organic farmer is obviously healthier than the conventional apple from the supermarket. I think this is scientific fact. But in the end, any apple is better than a pack of gummy bears. And that’s what can be so disruptive about AI for us. Because at the end of the day, we are providing information. And if users get information that is sufficient, that is good enough, why should they go the extra mile to get even better information? I don’t know.
SO: Okay, so I’m really interested in this gummy bear comparison and I want to hear a little bit more about that. But why is your view of the tech comm team’s role so, let’s say, pessimistic?
SG: I think my focus has gotten a little wider recently. I’m not really just looking at technical documentation. When it comes to technical documentation, we are lost without structured data. It will not work. But if we take the bigger picture, at Quanos we not only have a CCMS, but we also create a digital twin for information. I’m in all these working groups as the guy from the tech doc area. And I always have to accept that our particularly well-structured information from tech doc, the one with extra vitamins and secondary nutrients, is actually the exception out there when we look at the data silos that we want to combine in the info twin. When I was young, I believed that we had to convince others to work the way we do in tech docs. That would have been really fantastic. But if we’re honest with ourselves, it just doesn’t work. The advantages that XML provides for technical documentation are too small in the other areas and for individuals to justify a switch. The exceptions prove the rule. As a result, tons of information is out there locked up in these unstructured formats. And it can only be made accessible with AI. That will be the key.
SO: And how do we do that? If XML isn’t the right strategy, what does that look like?
SG: Well, so let’s take an example. So many of our customers build machinery and let’s take a look at the documentation that they supply. There are several dozen PDFs for each order. And of course the editor has a checklist and knows what to look for in this pile of PDFs. The test certificate, the maintenance table, parts lists, and so on. And even though the PDFs are completely “unstructured” as compared to XML files, we humans are able to extract the necessary information. And the exciting thing about it is that anyone can actually do it. So you don’t have to be a specialist in bottling systems or industrial pumps or sorting machines. If you have an idea of what a test certificate, a maintenance table, a parts list is, then you can find it. And here’s the kicker: the AI can do that too.
SO: Ahh. And so in this case are you more concerned with metadata…or something else?
SG: No, you’re right. So this is in fact about metadata and links. I find it fascinating what this does to our language usage. Because we have gotten used to saying that we enrich the content with metadata. But in many cases we have simply made the invisible structure explicit. No information was added. Nothing has become richer, just clearer. But now imagine that your supplier didn’t provide a maintenance table. Then you need to start reading, understand the maintenance instructions, and extract the necessary information. And that’s tedious. Even here, AI can still provide support. But how well depends on the clarity of maintenance procedures. The more specific background knowledge is necessary, the more difficult it becomes for the AI to provide assistance.
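The “anyone can find the maintenance table” idea can be sketched as a toy classifier: keyword cues identify the document type without any understanding of the product itself. The keywords and categories here are invented for illustration.

```python
# Toy sketch of recognizing supplier document types by keyword cues,
# the kind of pattern matching that "anyone can do" and that an AI can
# also perform. The keywords and categories are invented for illustration.

DOC_TYPES = {
    "test certificate": ["certificate", "test report", "tested in accordance"],
    "maintenance table": ["maintenance interval", "lubricate", "every 500 hours"],
    "parts list": ["part number", "quantity", "spare parts"],
}

def classify(text: str) -> str:
    text = text.lower()
    # Count how many cue phrases of each type appear in the text.
    scores = {name: sum(kw in text for kw in kws)
              for name, kws in DOC_TYPES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(classify("Spare parts: part number 4711, quantity 2"))  # prints "parts list"
```

You don’t need to be a specialist in bottling systems to spot a parts list, and neither does the classifier; the limits appear only when, as in the next exchange, the required knowledge isn’t in the text at all.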
SO: What does that look like? Do you have an example or use case where AI doesn’t help at all?
SG: It depends on contextual knowledge. I once received parts of a risk analysis from a customer. And her question was, “Can you use AI to create safety messages?” And I said, “Sure, look at the risk analysis and then look at what the technical writers made of it.” And they were exemplary safety messages. But there was so little content in the risk analysis that with the best intentions in the world you couldn’t do anything with artificial intelligence; that end result was only possible because the technical writers had an incredibly good understanding of the product and also had the industry standards. The information was not hidden in this input, but in the contextual knowledge. And that’s so specialized that it’s of course not available in the Large Language Model.
SO: In this use case, you don’t see any possibility for AI at all?
SG: Well, at least not for a generic Large Language Model. So something like ChatGPT or Claude, they have no chance. There is an opportunity in AI to specialize these models again. You can fine-tune this with context-specific content. But we don’t yet know at the moment whether we normally have enough content. There are some initial experiments. But let’s think back to the water molecules. We need quite a few of them to make an iceberg or even a snowman. Ultimately, you have to ask which supporting materials are needed from which point of view, and fine-tuning is really expensive. So there are costs. It takes a long time. Performance is also an issue. And how practical is this approach? Do we have training data? So, given all these aspects, it is still unclear what the gold standard is for making a generic large language model usable for content work in very specific contexts. We just don’t know today.
SO: Can you already see or predict how generative AI will change or must change technical documentation?
SG: I really think it’s more like looking into my crystal ball. So it’s not that easy to estimate which use cases are promising for the use of AI in technical documentation. As a rule, you have a task where a textual input needs to be transformed into a textual output according to a certain standard. And it used to be garbage in, garbage out. In my opinion, the Large Language Models change this equation permanently. Input that we were previously unable to process automatically due to a lack of information density, we can now enrich it with universal contextual knowledge in such a way that it becomes processable. Missing information cannot be added. We’ve discussed that now. But these unspoken assumptions, in fact, we can pack them in. And that helps us in many places in technical documentation, because one of the ways good technical documentation differs from bad documentation is that fewer assumptions are necessary in order to understand the text or if you want to process it automatically. And that’s why I find condensing information instead of creating knowledge to be a kind of Occam’s Razor. I look at the assignment. If it’s simply a matter of making hidden information visible or putting it into a different form, then this is a good candidate for generative AI. What if it’s more about refining the information by using other sources of information? Then it becomes more difficult. If I now have this information, this other information in a knowledge graph, if it is already broken down there, then I can explicitly enrich the information before handing it over to the Large Language Model. And then it works again. But if the information, for example, the inherent product knowledge, is in the editor’s head, as was the case with my client’s risk analysis, then the Large Language Model simply has no chance. It won’t generate any added value. Then you may have to rethink your approach. Can you divide the task somehow? 
Maybe there is a part where this knowledge is not necessary, and I have an upstream or downstream process where I can optimize something with AI. And I think that’s where the mother lode of opportunities lies. This art of distinguishing what is possible from what is impossible, and it will be more of a kind of engineering art, will be the factor in the coming years that decides whether generative AI is of use to me or not.
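Sebastian’s triage, deciding where the required knowledge lives before reaching for generative AI, can be summarized as a small decision sketch. The category labels are invented for illustration, not a formal method.

```python
# A sketch of the "Occam's razor" triage described above: classify a
# documentation task by where its required knowledge lives. The category
# labels are invented for illustration, not a formal method.

def triage_genai_task(knowledge_source: str) -> str:
    if knowledge_source == "in_the_text":
        # Hidden or implicit information only needs to be made visible
        # or reshaped: a good candidate for a generic LLM.
        return "good candidate: use generative AI directly"
    if knowledge_source == "in_a_knowledge_graph":
        # Enrich the input with graph facts before prompting the LLM.
        return "workable: enrich input from the knowledge graph first"
    if knowledge_source == "in_an_experts_head":
        # Like the risk-analysis example: the LLM has no access to it.
        return "poor candidate: split the task or keep it manual"
    return "unclear: first analyze where the required knowledge lives"

print(triage_genai_task("in_the_text"))
```

The third branch is the risk-analysis case from earlier in the conversation: the answer is to split the task and apply AI only to the upstream or downstream parts that don’t need the expert’s head.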
SO: And what do you think? Of use, or not of use?
SG: I think we’ll figure it out. But it will take much longer than we think.
SO: Yes, I think that’s true. And so thank you very much, Sebastian. These are really very interesting perspectives and I’m looking forward to our next discussion, when in two weeks or three months there will be something completely new in AI and we’ll have to talk about it again, yes, what can we do today or what new things are available? So thank you very much and see you soon!
SG: … soon somewhere on this planet.
SO: Somewhere.
SG: Thank you for the invitation. Take care, Sarah.
SO: Yes, thank you, and many thanks to those listening, especially for the first time in the German-speaking areas. Further information about how we produced this podcast is available at scriptorium.com. Thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Strategies for AI in technical documentation (podcast, English version) appeared first on Scriptorium.
Episode 169 is available in English and German. Since our guest Sebastian Göttel works with AI in the German-speaking world, we had the idea to create this podcast in German. The English version was then put together with AI support.
Sarah O’Keefe: What does generative AI have to do with poem interpretations?
Sebastian Göttel: You often have the impression that AI creates knowledge, that is, creates information out of nothing. And the question is, is that really the case? For German literature scholars, I think it is quite normal not only to look at the text at hand, but also to read between the lines and let the cultural subtext flow in. And from the perspective of German literature scholars, generative AI actually only interprets or reconstructs information that already exists. It may be hidden, only implicitly hinted at. But it then becomes visible through the AI.
Transcript:
Sarah O’Keefe: Today’s episode is available in English and German. Since our guest works with AI in the German-speaking world, we had the idea to create this podcast in German. The English version was then put together with AI support. So welcome to the Content Strategy Experts Podcast, today for the first time in German. Our topic today is information compression instead of knowledge creation: strategies for AI in technical documentation. We tried to bring it all together into one word, but that didn’t quite work. Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize and distribute content in an efficient way. In this episode, we talk about best practices for AI and tech comm with our guest Sebastian Göttel of Quanos. Hello, my name is Sarah O’Keefe. I am the CEO here at Scriptorium. My guest is Sebastian Göttel. Sebastian Göttel has been working in the area of XML and editorial CCMS systems in technical documentation for over 25 years. He originally studied computer science with a focus on AI. Currently, he is Product Manager at Quanos for Schema ST4, one of the most widely used editorial systems in machinery and plant engineering in the DACH region. He is also active in Tekom and, among other things, contributed to version 1 of the iiRDS standard. Sebastian lives with his wife and daughter, three cats, and two mice just outside Nuremberg. Sebastian, welcome. I look forward to this exchange. In English we say create once, publish everywhere. Here, it’s about recording once and outputting multiple times. So, off we go. Sebastian, our topic today is, as I said, information compression instead of knowledge creation, and how this strategy could be used for AI in technical documentation. So please, explain.
Sebastian Göttel: Yes, first of all, thank you for inviting me to the podcast. It’s not that easy to impress a 14-year-old daughter. And I thought, with this podcast I have a chance. So I told her that I would soon be talking about AI on an American podcast. And the reaction was a little different than I expected. “Youuuu will speak English there?” You can put quite a lot of meaning into a single “youuuu” like that. And that’s why, for one thing, I’m glad that I get to speak German here. But, and this is now the transition to the topic, what will the AI make of “Youuuu will speak English there”? How should it pronounce that correctly in text-to-speech or carry it over into another language? And that, I believe, is what our conversation today will be about. If we want to understand how AI understands us, but also how we can use it in technical documentation, then we have to talk about information compression, but also invisible information. “Youuuu will speak English there.” Can the AI reconstruct that my daughter doesn’t trust me to do this, or simply finds my German accent in English dreadful? Well, if the AI can reconstruct that, is it then new information, or rather information that was already there, and that both father and daughter were actually aware of during the conversation? I find it quite exciting that German literature scholars have dealt with this very often. Namely, what is in such a text, and what is meant in the text? What’s between the lines? And when you think back to your school days, these poem interpretations immediately come to mind.
SO: Poems — what does generative AI have to do with poem interpretation?
SG: Well, one often has the impression that AI creates knowledge — that it produces information out of nothing. And the question is: is that really so?
For literary scholars, I think, it’s quite normal not only to look at the text at hand but also to read between the lines and bring the cultural subtext into play. And from that perspective, generative AI really only interprets or reconstructs information that already exists. It may be hidden, only implied — but the AI makes it visible. Wow, I never thought I’d be invoking literary scholars on a technical podcast.
SO: Neither did I. But the question remains: how does that work? How does the AI work, and why does it work? And why do these problems occur? What is our current understanding of the situation?
SG: I think we’re still pretty impressed by generative AI, and we’re still trying to grasp what we’re even perceiving, what is happening there. There are things that simply make our jaws drop. And then there are the epic fails, like the recent depiction of Wehrmacht soldiers by Gemini, Google’s generative AI. The soldiers were politically correct by today’s standards — among them, Asian-looking women in steel helmets. I like to compare this with the early days of navigation systems. The newspapers were full of anecdotes about someone driving into a river because their nav system mistook a ferry line for a bridge. That kind of error was relatively easy to fix in a navigation system; it was clear why the device had made the mistake. With generative AI, unfortunately, it’s not that simple. We don’t know — we haven’t even really understood — how these partly intelligent feats come about. But the epic fails make us aware that we’re not dealing with an algorithm, but with a phenomenon that apparently emerges when you pack many billions of texts into a matrix.
SO: And what do you mean by “emerges”? What is that?
SG: It’s a term from the natural sciences. I once compared it to water molecules. A single water molecule isn’t particularly spectacular, but if you’re on a sailboat in a storm on the Atlantic, or you run into an iceberg, you get a different perspective. Many water molecules taken together show entirely new behavior, and that’s called emergence. Physics and chemistry needed many centuries to halfway unravel it. And I think we’ll need — maybe not quite that long, but a good deal more research on generative AI to understand a bit better what exactly is going on there. And I think the epic fails should make us aware that, for now, we do well not to place our fate blindly in the hands of a large language model. The human-in-the-loop approach, where the AI makes a suggestion and a person then reviews it, remains the best mode for the time being. And the translation industry, which feels like it’s a few years ahead of the rest of the world when it comes to generative AI and neural networks, recognized this quite cleverly and put it to profitable use.
SO: So if translation is the model, what does that mean for generative AI and technical documentation?
SG: That’s a good question. Let’s take a step back. At the beginning of my working life, the revolution in technical documentation was structured documents — SGML and XML. We’ve known them for several decades now, and they’re still not standard in every documentation team. So we have these structured documents, and on the other side are the evil unstructured documents. I’ve always found that a bit of a mislabeling, because unstructured documents are in reality structured too — at least mostly. There’s a macro level: a table of contents, a title page, an index. There are chapters. Then there are paragraphs, lists, and tables, and it goes all the way down to the sentence level: enumerations, instructions, and so on. It’s not for nothing that some linguists call this text structure. And the beautiful thing about XML is that it makes this implicit structure explicit. That lets the computer compute with our texts. Because if we’re honest, in the end XML isn’t for us — it’s for the machine.
SO: Could it be, then, that AI can discover structures that, for us humans, could until now only be expressed through XML?
SG: Yes. I recently looked into Invisible XML, where you lay patterns over unstructured text and they’re then made visible as XML. Very clever. And I think generative AI is a kind of high-performance Invisible XML — not quite as strict in its rules as Invisible XML, but in return it also understands linguistic nuance. I found one case fascinating: a customer of ours fed unstructured content from PDFs into ChatGPT to convert it to XML. The AI discovered the invisible structure hidden in the texts astonishingly well and produced really good XML. That was impressive. So when AI appears to create information out of nothing, it’s really making existing but hidden information visible.
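The idea of making implicit text structure explicit can be sketched in a few lines of Python. This is a deliberately tiny, rule-based toy — closer in spirit to an Invisible XML grammar than to an LLM, which would handle far messier input — and the heuristics, element names, and sample text are invented for illustration:

```python
import re

def to_xml(text: str) -> str:
    """Make the implicit structure of a plain-text procedure explicit as XML.

    Toy heuristics: the first line is the title, lines starting with a
    number and a period are steps, everything else becomes a note.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    out = ["<task>", f"  <title>{lines[0]}</title>", "  <steps>"]
    for ln in lines[1:]:
        m = re.match(r"(\d+)\.\s+(.*)", ln)
        if m:
            out.append(f'    <step n="{m.group(1)}">{m.group(2)}</step>')
        else:
            out.append(f"    <note>{ln}</note>")
    out.append("  </steps>")
    out.append("</task>")
    return "\n".join(out)

raw = """Replace the filter
1. Switch off the pump.
2. Open the housing.
Wear gloves.
3. Insert the new filter."""

print(to_xml(raw))
```

No information is added here — the title, steps, and warning were already in the text; the conversion only makes them machine-readable, which is Sebastian’s point about hidden structure.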
SO: Yes, but I think the problem is that this hidden structure exists in some documents, while in others you have what we call in English “crap on a page” — there’s no structure at all. And from one document to the next there’s no consistency; they’re completely different. Writer 1 and writer 2 each write away and never talk to each other. So when the AI now creates a whole chapter and an outline from a few keywords, how does that work? How does that fit together?
SG: You’re right. Until now we’ve been talking about taking a PDF and layering XML on top. But now I’m at the point of saying: I toss in a few keywords and ChatGPT suddenly writes something. Yet I think the same idea applies here — that this, too, is hidden information. That may sound a bit daring at first, but nothing new, nothing completely surprising emerges. If I simply ask ChatGPT for an outline for machinery documentation, something comes out — and I think most of our listeners would write it down in just the same way. That’s nothing new. It’s hidden information contained in the training data, made visible by the request. Ultimately, generative AI creates this information from my query and that enormous mass of training data, and the answer is chosen so that it fits both my query and the training data well — it lays itself over them like a layer and simulates it well. In the end I don’t get new information, but hopefully the information I need in a more processable form: either enriched with XML, as in the PDF example earlier, or as an outline. I picture it a bit like a juice press: it doesn’t invent the juice, it just gets it out of the oranges.
SO: Making information more processable — that almost sounds like a job description for technical writers. And what about other methods? If we have metadata or knowledge graphs, what does that look like?
SG: Right — alongside XML, those are of course very important. Metadata condenses information into a few data points, and knowledge graphs then capture the relationships between those data points. And precisely through that, knowledge graphs — and metadata too — make invisible information visible: relationships that were previously only implicitly true can now be traced through the knowledge graph. That combines beautifully with generative AI. At first the knowledge graph experts were a bit nervous — you could sense it at conferences — but now they’re actually quite pleased, because they’ve found that generative AI plus knowledge graphs is much better than generative AI without them. And that’s great. It’s not the only trick, by the way, where we in technical documentation have something that gives generative AI a leg up. If you want to make large knowledge bases searchable with large language models, these days you do that with RAG — retrieval-augmented generation. It lets you combine your own documents with a pretrained model like ChatGPT very cost-effectively. And if you combine RAG with faceted search, the way we normally have it in techcomm content delivery portals, the results are much better than with the usual vector search — which in the end is just a better full-text search. That’s another place where the structured information we already have gives AI a leg up.
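The facet-plus-retrieval idea Sebastian describes can be sketched as a toy in-memory retriever: filter by facet metadata first, then rank the survivors by text similarity. Everything here is invented for illustration — the corpus, facet names, and product IDs — and a real RAG pipeline would use embeddings and hand the top chunks to an LLM:

```python
from collections import Counter
from math import sqrt

# Tiny corpus: each entry carries facet metadata, as in a content
# delivery portal, plus the chunk text. All values are made up.
DOCS = [
    {"product": "pump-a", "type": "maintenance", "text": "replace the seal every 500 hours"},
    {"product": "pump-a", "type": "spare-parts", "text": "seal kit part number 4711"},
    {"product": "mixer-b", "type": "maintenance", "text": "replace the seal monthly"},
]

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity over raw term counts (stand-in for embeddings)."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, facets: dict) -> list[str]:
    """Facet-filter first, then rank the remaining chunks by similarity.

    The facet step is what keeps the mixer's maintenance advice out of
    a question about the pump, which pure vector search cannot guarantee.
    """
    q = Counter(query.split())
    hits = [d for d in DOCS if all(d.get(k) == v for k, v in facets.items())]
    hits.sort(key=lambda d: cosine(q, Counter(d["text"].split())), reverse=True)
    return [d["text"] for d in hits]

print(retrieve("replace seal", {"product": "pump-a"}))
```

The structured metadata does the heavy lifting before the similarity ranking ever runs — which is exactly the “structured information gives AI a leg up” point.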
SO: So would you also say that structured information doesn’t become obsolete through AI, but actually becomes even more important?
SG: I do have the impression that a belief has taken hold that structured information is better for AI. Of course, we’re all a bit biased — we have to believe it; it’s the fruit of our labor. It’s also a bit like how the apple from the organic farmer is naturally healthier than the conventional apple from the supermarket — I believe that’s scientifically well established. But in the end, an apple always beats a bag of gummy bears. And that’s what can be so disruptive about AI for us. Ultimately, we’re in the business of conveying information. And if the user gets information that’s sufficient, that’s good enough, why should they go the extra mile for even better information? I don’t know.
SO: I’m really interested in this gummy-bear analogy — I want to hear a bit more about that. But why is your take, let’s say, so pessimistic for documentation teams?
SG: I think my picture has gotten a bit bigger lately. This isn’t really about technical documentation for me. In technical documentation we’re lost without structured data — that simply won’t work.
But look at the bigger picture: at Quanos we don’t only have a CCMS, we also build a digital information twin. And I always sit in those working groups as the guy from the tech doc corner. And there I have to accept that our especially well-structured information from tech doc — the kind with all the vitamins and secondary plant compounds — is in reality out there rather the exception, when we look at the data silos we want to bring together in the info twin. When I was young, I still believed we had to convince the others to work the way we do in technical documentation. That would have been great. But if we’re honest, it just doesn’t happen. The advantages XML gives us in technical documentation are too small, for the individual colleagues in other areas, to make them switch — exceptions prove the rule. Which means there are tons of information out there locked up in unstructured formats. And they can only be made accessible with AI. That will be the key.
SO: And how do we do that? If XML isn’t the right path there, what does it look like?
SG: Well, let’s take an example. Many of our customers are machinery and plant engineering companies, so let’s look at supplier documentation. For a single order, several dozen PDFs come in. Of course the technical writer has a checklist and knows what to look for in that pile of PDFs: the test certificate, the maintenance table, spare parts lists, and so on. And although the PDFs are completely unstructured — “unstructured” in the sense we XML people use the word — we humans are able to extract that information. And the exciting part: anyone can do it. You don’t have to be a specialist in filling lines or industrial pumps or sorting machines. If you have a notion of what a test certificate, a maintenance table, or a spare parts list is, you’ll find them. And here’s the thing: then the AI can find them too.
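The writer’s checklist can itself be sketched as code: assign each incoming supplier document a type based on characteristic vocabulary. The keyword lists and sample strings below are invented, and this crude matcher only stands in for what Sebastian describes — an LLM would recognize synonyms, other languages, and layout variation far more robustly:

```python
# The checklist: document types and the vocabulary that betrays them.
# Keyword sets are illustrative, not from any real supplier spec.
CHECKLIST = {
    "test-certificate": {"certificate", "tested", "conforms"},
    "maintenance-table": {"maintenance", "interval", "lubricate"},
    "spare-parts-list": {"spare", "part", "quantity"},
}

def classify(text: str) -> str:
    """Pick the document type whose keywords overlap the text the most."""
    words = set(text.lower().split())
    scores = {doc_type: len(words & kw) for doc_type, kw in CHECKLIST.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(classify("maintenance interval: lubricate bearing every 200 hours"))
```

The point survives the simplification: identifying which PDF is the maintenance table requires generic knowledge of document types, not expertise in filling lines — which is why a generic model can do it at all.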
SO: Aha. So in this case, are you talking about metadata, or about something else?
SG: No, you’re right — this is indeed about metadata and links.
I find it fascinating what this does to our language. We’ve gotten used to saying that we enrich content with metadata. But in many cases we’ve simply made the invisible structure explicit. No information was added — nothing got richer, just clearer. But now imagine your supplier didn’t deliver a maintenance table. Then you have to start reading the maintenance procedures, understand them, and extract the necessary information. And that’s quite laborious. Even here, AI can still help. But how well depends on how clearly the maintenance tasks are described. And the more specific background knowledge is required, the harder it becomes for the AI to contribute anything helpful.
SO: And what does that look like? Do you have an example or a use case where AI doesn’t help at all?
SG: As I said, it depends on the contextual knowledge. A customer once gave me parts of a risk analysis, and the question was whether AI could generate safety warnings from it. I said, sure, let’s look at the risk analysis and then at what the writers made of it. And the safety warnings were exemplary. But there was so little in the risk analysis that, with the best will in the world, there was nothing AI could have done with it. It only worked because the writers had an incredibly good understanding of the product and also had the relevant standards in the back of their minds. The information wasn’t hidden in the input — it was in the contextual knowledge. And that knowledge is so specialized that it’s naturally not in a large language model either.
SO: In a use case like that, do you see no possibility for AI at all?
SG: At least not for a generic large language model — ChatGPT or Claude are hopeless there. In AI there is the option of specializing these models further; you can fine-tune them with context-specific texts. But whether we normally have enough texts for that, we don’t really know yet — there are first experiments. Think back to the water molecules, though: for an iceberg, or even just a snowman, you need quite a lot of them. So today it comes down to weighing the tools against the constraints: fine-tuning is really expensive, so cost; it takes a long time, so performance is an issue too; and how practical is it — do we even have training data? Under all these aspects, what the golden path really is for making a generic large language model usable for text work in very specific contexts is simply still unclear. We just don’t know today.
SO: Can you already see, or foresee, how generative AI will change technical documentation — or how it must?
SG: That still really feels like gazing into a crystal ball. It’s not at all easy to assess which use cases for AI in technical documentation are promising. As a rule, you have a task in which textual input is to be transformed into textual output according to some specification. It used to be garbage in, garbage out. In my view, large language models change that equation for good. Input that we previously couldn’t process automatically for lack of information density can now be enriched with universal contextual knowledge — enriched to the point where it becomes processable. Missing information can’t be filled in; we’ve just discussed that. But those unspoken assumptions — those we can indeed pack in. And that helps us in technical documentation in many places, because one thing that distinguishes good technical documentation from bad is that fewer assumptions are needed to understand the text, or to process it by machine. That’s why “information condensation instead of knowledge creation” is, for me, a kind of Occam’s razor. I look at the task at hand. If it’s simply about making hidden information visible, or bringing it into another form, then it’s a good candidate for generative AI. Or is it rather about refining the information by drawing on other information sources? Then it gets harder. If I have that other information in a knowledge graph, already broken down there, then I can enrich the input explicitly before handing it to the large language model — and then it works again.
But if the information is, for example, the inherent product knowledge in the writer’s head — as with my customer’s risk analysis — then the large language model simply has no chance. It won’t generate any added value there. Then you may have to consider whether the task can somehow be split up: perhaps there’s a part where that knowledge isn’t needed, an upstream or downstream process step where AI can optimize something. And I think that’s where the action will be in the future. This art of distinguishing the feasible from the unfeasible — and it will be a kind of engineering craft — will be the deciding factor in the coming years for whether generative AI delivers value for me or not.
SO: And what do you think — will it, or not?
SG: I think we’ll figure it out. But it will take much longer than we believe.
SO: Yes, I think that’s right.
Thank you, Sebastian. These are really interesting perspectives, and I’m looking forward to our next discussion when, in two weeks or three months, something entirely new comes along in AI and we have to talk again about what we can do today — or right now. So thank you very much, and we’ll see each other …
SG: … soon, somewhere on this planet.
SO: Somewhere.
SG: Thanks for the invitation. Take care, Sarah.
SO: And many thanks to our listeners, especially for this first episode in German. More information is available at scriptorium.com. Thank you for listening to the Content Strategy Experts podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Strategien für KI in der technischen Dokumentation (podcast, Deutsche version) appeared first on Scriptorium.